MYCOBACTERIUM TUBERCULOSIS DNA SEQUENCES ENCODING
IMMUNOSTIMULATORY PEPTIDES 

CROSS REFERENCE TO RELATED CASES 
This application claims the benefit of U.S. Provisional Application No. 60/000,254, filed June 15, 1995,
which is incorporated herein by reference.
I. BACKGROUND 

A. THE RISE OF TUBERCULOSIS 

Over the past few years the editors of the Morbidity and Mortality Weekly Report have chronicled the
unexpected rise in tuberculosis cases. It has been estimated that worldwide there are one billion people infected
with M. tuberculosis, with 7.5 million active cases of tuberculosis. Even in the United States, tuberculosis
continues to be a major problem, especially among the homeless, Native Americans, African-Americans,
immigrants, and the elderly. HIV-infected individuals represent the newest group to be affected by tuberculosis.
Of the 88 million new cases of tuberculosis expected in this decade, approximately 10% will be attributable to HIV
infection.

The emergence of multi-drug resistant strains of M. tuberculosis has complicated matters further and even
raises the possibility of a new tuberculosis epidemic. In the U.S. about 14% of M. tuberculosis isolates are
resistant to at least one drug, and approximately 3% are resistant to at least two drugs. M. tuberculosis strains
have even been isolated that are resistant to all seven drugs in the repertoire of drugs commonly used to combat
tuberculosis. Resistant strains make treatment of tuberculosis extremely difficult: for example, infection with M.
tuberculosis strains resistant to isoniazid and rifampin leads to mortality rates of approximately 90% among
HIV-infected individuals. The mean time to death after diagnosis in this population is 4-16 weeks. One study reported
that of nine immunocompetent health care workers and prison guards infected with drug-resistant M. tuberculosis,
five died. The expected mortality rate for infection with drug-sensitive M. tuberculosis is 0%.

The unrelenting persistence of mycobacterial disease worldwide, the emergence of a new, highly
susceptible population, and the recent appearance of drug-resistant strains point to the need for new and better
prophylactic and therapeutic treatments of mycobacterial diseases.

B. TUBERCULOSIS AND THE IMMUNE SYSTEM 

Infection with M. tuberculosis can take on many manifestations. The growth in the body of M.
tuberculosis and the pathology that it induces are largely dependent on the type and vigor of the immune response.
From mouse genetic studies it is known that innate properties of the macrophage play a large role in containing
disease (1). Initial control of M. tuberculosis may also be influenced by reactive γδ T cells. However, the major
immune response responsible for containment of M. tuberculosis is via helper T cells (Th1) and to a lesser extent
cytotoxic T cells (2). Evidence suggests that there is very little role for the humoral response. The ratio of
responding Th1 to Th2 cells has been proposed to be involved in the phenomenon of suppression.

Th1 cells are thought to convey protection by responding to M. tuberculosis T cell epitopes and secreting
cytokines, particularly interferon-γ, which stimulate macrophages to kill M. tuberculosis. While such an immune
response normally clears infections by many facultative intracellular pathogens, such as Salmonella, Listeria or
Francisella, it is only able to contain the growth of other pathogens such as M. tuberculosis and Toxoplasma.
Hence, it is likely that M. tuberculosis has the ability to suppress a clearing immune response, and mycobacterial
components such as lipoarabinomannan are thought to be potential agents of this suppression. Dormant M.
tuberculosis can remain in the body for long periods of time and can emerge to cause disease when the immune
system wanes due to age or other effects such as infection with HIV-1.

Historically it has been thought that one needs replicating mycobacteria in order to effect a protective
immunization. A hypothesis explaining the molecular basis for the effectiveness of replicating mycobacteria in
inducing protective immunity has been proposed by Orme and co-workers (3). These scientists suggest that
antigens are pinocytosed from the mycobacteria-laden phagosome and used in antigen presentation. This
hypothesis also explains the basis for secreted proteins effecting a protective immune response.

Antigens that stimulate T cells from M. tuberculosis infected mice or from PPD-positive humans are found
both in whole mycobacterial cells and in the culture supernatant (3, 4, 5-7, 34). Recently Pal and
Horwitz (8) were able to induce partial protection in guinea pigs by vaccinating with M. tuberculosis supernatant
fluids. Similar results were found by Andersen using a murine model of tuberculosis (9). Other studies include
reference nos. 34 and 12. Although these works are far from definitive, they do strengthen the notion that protective
epitopes can be found among secreted proteins and that a non-living vaccine can protect against tuberculosis.

For the purposes of vaccine development one needs to find epitopes that confer protection but do not
contribute to pathology. An ideal vaccine would contain a cocktail of T-cell epitopes that preferentially stimulate
Th1 cells and are bound by different MHC haplotypes. Although such vaccines have never been made, there is at
least one example of a synthetic T-cell epitope inducing protection against an intracellular pathogen (10). It is an
object of this invention to provide M. tuberculosis DNA sequences that encode bacterial peptides having an
immunostimulatory activity. Such immunostimulatory peptides will be useful in the treatment, diagnosis and
prevention of tuberculosis.
II. SUMMARY OF THE INVENTION 

The present invention provides DNA sequences isolated from Mycobacterium tuberculosis. Peptides
encoded by these DNA sequences are shown to stimulate the production of the macrophage-stimulating cytokine,
gamma interferon ("INF-γ"), in mice. Critically, the production of INF-γ by CD4 cells in mice has been shown to
correlate with maximum expression of protective immunity against tuberculosis (11). Furthermore, in human
patients with active "minimal" or "contained" tuberculosis, it appears that the containment of the disease may be
attributable, at least in part, to the production of CD4 Th1-like lymphocytes that release INF-γ (12).

Hence, the DNA sequences provided by this invention encode peptides that are capable of stimulating
T-cells to produce INF-γ. That is, these peptides act as epitopes for CD4 T-cells in the immune system. Studies
have demonstrated that peptides isolated from an infectious agent and which are shown to be T-cell epitopes can
protect against the disease caused by that agent when administered as a vaccine (13, 10). For example, T-cell
epitopes from the parasite Leishmania major have been shown to be effective when administered as a vaccine (10,
13-14). Therefore, the immunostimulatory peptides (T-cell epitopes) encoded by the disclosed DNA sequences
may be used, in purified form, as a vaccine against tuberculosis.

As noted, the nucleotide sequences of the present invention encode immunostimulatory peptides. In a 
number of instances, these nucleotide sequences are only a part of a larger open reading frame (ORF) of an 
M. tuberculosis operon. The present invention enables the cloning of the complete ORF using standard molecular
biology techniques, based on the nucleotide sequences provided herein. Thus, the present invention encompasses 
both the nucleotide sequences disclosed herein and the complete M. tuberculosis ORFs to which they correspond. 
However, it is noted that since each of the nucleotide sequences disclosed herein encodes an immunostimulatory 
peptide, the use of larger peptides encoded by the complete ORFs is not necessary for the practice of the invention. 
Indeed, it is anticipated that, in some instances, proteins encoded by the corresponding ORFs may be less 
immunostimulatory than the peptides encoded by the nucleotide sequences provided herein. 

One aspect of the present invention is an immunostimulatory preparation comprising at least one peptide 
encoded by the DNA sequences presented herein. Such a preparation may include the purified peptide or peptides 
and one or more pharmaceutically acceptable adjuvants, diluents and/or excipients. Another aspect of the
invention is a vaccine comprising one or more peptides encoded by nucleotide sequences provided herein. This
vaccine may also include one or more pharmaceutically acceptable excipients, adjuvants and/or diluents.

Another aspect of the present invention is an antibody specific for an immunostimulatory peptide encoded
by a nucleotide sequence of the present invention. Such antibodies may be used to detect the presence of M.
tuberculosis antigens in medical specimens, such as blood or sputum. Thus, these antibodies may be used to
diagnose tuberculosis infections.

The present invention also encompasses the diagnostic use of purified peptides encoded by the nucleotide 
sequences of the present invention. Thus, the peptides may be used in a diagnostic assay to detect the presence of 
antibodies in a medical specimen, which antibodies bind to the M. tuberculosis peptide and indicate that the subject 
from which the specimen was removed was previously exposed to M. tuberculosis.

The present invention also provides an improved method of performing the tuberculin skin test to diagnose 
exposure of an individual to M. tuberculosis. In this improved skin test, purified immunostimulatory peptides 
encoded by the nucleotide sequences of this invention are employed. Preferably, this skin test is performed with 
one set of the immunostimulatory peptides, while another set of the immunostimulatory peptides is used to 
formulate vaccine preparations. In this way, the tuberculin skin test will be useful in distinguishing between
subjects infected with tuberculosis and subjects who have simply been vaccinated. In this manner, the present
invention may overcome a serious limitation inherent in the present BCG vaccine/tuberculin skin test combination.

Other aspects of the present invention include the use of probes and primers derived from the nucleotide 
sequences disclosed herein to detect the presence of M. tuberculosis nucleic acids in medical specimens. 

A further aspect of the present invention is the discovery that a significant proportion of the 
immunostimulatory peptides are homologous to proteins known to be located in bacterial cell surface membranes. 
This discovery suggests that membrane-bound peptides, particularly those from M. tuberculosis, may be a new 
source of antigens for use in vaccine preparations. 

III. BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows the deduced amino acid sequence of the full length MTB2-92 protein.

Fig. 2 shows an SDS polyacrylamide gel (12%) representing the different stages of the purification of
MTB2-92. Lane 1: Molecular weight markers (high range, GIBCO-BRL, Grand Island, NY, U.S.A.); Lane 2:
IPTG-induced crude bacterial lysate of E. coli JM109 containing pMAL-MTB2-92; Lane 3: Uninduced crude
bacterial lysate of E. coli JM109 containing pMAL-MTB2-92; Lane 4: Eluate from the amylose-resin column
containing the MBP-MTB2-92 fusion protein; Lane 5: Eluate shown in previous lane after cutting with protease
Factor Xa; Lane 6: Eluate from the Ni-NTA column, containing MTB2-92.

IV. DESCRIPTION OF THE INVENTION 

A. DEFINITIONS 

Particular terms and phrases used herein have the meanings set forth below. 

"Isolated". An "isolated" nucleic acid has been substantially separated or purified away from other nucleic
acid sequences in the cell of the organism in which the nucleic acid naturally occurs, i.e., other chromosomal and
extrachromosomal DNA and RNA. The term "isolated" thus encompasses nucleic acids purified by standard 
nucleic acid purification methods. The term also embraces nucleic acids prepared by recombinant expression in a 
host cell as well as chemically synthesized nucleic acids. 

The nucleic acids of the present invention comprise at least a minimum length able to hybridize specifically
with a target nucleic acid (or a sequence complementary thereto) under stringent conditions as defined below. The
length of a nucleic acid of the present invention is preferably 15 nucleotides or greater, although a shorter
nucleic acid may be employed as a probe or primer if it is shown to specifically hybridize under stringent
conditions with a target nucleic acid by methods well known in the art.

"Probes" and "primers". Nucleic acid probes and primers may readily be prepared based on the nucleic
acid sequences provided by this invention. A "probe" comprises an isolated nucleic acid attached to a detectable
label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and
enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are
discussed, e.g., in reference nos. 15 and 16.

"Primers" are short nucleic acids, preferably DNA oligonucleotides 15 nucleotides or more in length, 
which are annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between 
the primer and the target DNA strand, then extended along the target DNA strand by a DNA polymerase enzyme. 
Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction 
(PCR) or other nucleic-acid amplification methods known in the art.

As noted, probes and primers are preferably 15 nucleotides or more in length, but, to enhance specificity,
probes and primers of 20 or more nucleotides may be preferred. 
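
The length guidance and the annealing step described above can be illustrated with a short sketch. It is not part of the disclosed methods: the function names are hypothetical, only the 15- and 20-nucleotide figures come from the text, and exact complementarity is assumed as a simplification, since real hybridization tolerates some mismatch.

```python
# Illustrative sketch only; helper names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

MIN_LENGTH = 15        # "preferably 15 nucleotides or more in length"
PREFERRED_LENGTH = 20  # "20 or more nucleotides may be preferred"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def length_class(oligo):
    """Classify an oligonucleotide by the length guidance in the text."""
    n = len(oligo)
    if n >= PREFERRED_LENGTH:
        return "preferred"
    if n >= MIN_LENGTH:
        return "acceptable"
    return "short"


def anneals_perfectly(primer, target_strand):
    """True if the primer is exactly complementary to a site on the target strand.

    Both sequences are written 5'->3'. Exact matching is an assumption of
    this sketch, not a requirement stated in the text.
    """
    return reverse_complement(primer) in target_strand.upper()
```

A primer that falls in the "short" class is not excluded by the text, which allows shorter nucleic acids when specific hybridization is demonstrated experimentally.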

Methods for preparing and using probes and primers are described, for example, in reference nos. 15, 16
and 17. PCR primer pairs can be derived from a known sequence, for example, by using computer programs
intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research,
Cambridge, MA).

"Substantial similarity". A first nucleic acid is "substantially similar" to a second nucleic acid if, when
optimally aligned (with appropriate nucleotide insertions or deletions) with the other nucleic acid (or its
complementary strand), there is nucleotide sequence identity in at least about 75%-90% of the nucleotide bases,
and preferably greater than 90% of the nucleotide bases. ("Substantial sequence complementarity" requires a
similar degree of sequence complementarity.) Sequence similarity can be determined by comparing the nucleotide
sequences of two nucleic acids using sequence analysis software such as the Sequence Analysis Software Package
of the Genetics Computer Group (University of Wisconsin Biotechnology Center, Madison, WI).
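
As a rough illustration of the identity thresholds in this definition, the following sketch computes percent identity for two sequences that have already been optimally aligned. It does not reproduce the software package named above. The function names are hypothetical, gaps are assumed to be written as "-", and counting gap columns in the denominator is an assumption, since the text does not specify how insertions and deletions are scored.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding identical bases.

    Inputs must be pre-aligned and of equal length. A column containing
    a gap ('-') is counted as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def similarity_class(identity_pct):
    """Map a percent identity onto the tiers named in the definition."""
    if identity_pct > 90.0:
        return "preferred (greater than 90%)"
    if identity_pct >= 75.0:
        return "substantially similar (about 75%-90%)"
    return "not substantially similar"
```

For example, two aligned 10-base sequences that differ at a single position have 90% identity, which falls in the "about 75%-90%" tier and not the preferred one.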

"Operably linked". A first nucleic acid sequence is "operably" linked with a second nucleic acid sequence 
when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For
instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression
of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join
two protein coding regions, in the same reading frame.

"Recombinant". A "recombinant" nucleic acid is one that has a sequence that is not naturally occurring or 
has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This 
artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial
manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.

"Stringent Conditions" and "Specific". The nucleic acid probes and primers of the present invention 
hybridize under stringent conditions to a target DNA sequence, e.g., to a full length Mycobacterium tuberculosis
gene that encodes an immunostimulatory peptide.

The term "stringent conditions" is functionally defined with regard to the hybridization of a nucleic-acid 
probe to a target nucleic acid (i.e., to a particular nucleic acid sequence of interest) by the hybridization procedure
discussed in Sambrook et al. (1989) (reference no. 15) at 9.52-9.55. See also reference no. 15 at 9.47-9.52 and
9.56-9.58; reference no. 18; and reference no. 19.

Nucleic-acid hybridization is affected by such conditions as salt concentration, temperature, or organic
solvents, in addition to the base composition, length of the complementary strands, and the number of nucleotide-base
mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art.

In preferred embodiments of the present invention, stringent conditions are those under which DNA 
molecules with more than 25% sequence variation (also termed "mismatch") will not hybridize. Such conditions
are also referred to as conditions of 75% stringency (since hybridization will occur only between molecules with
75% sequence identity or greater). In more preferred embodiments, stringent conditions are those under which
DNA molecules with more than 15% mismatch will not hybridize (conditions of 85% stringency). In most
preferred embodiments, stringent conditions are those under which DNA molecules with more than 10% mismatch
will not hybridize (i.e., conditions of 90% stringency).
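
The relationship between mismatch and stringency stated above (stringency = 100% minus the maximum tolerated mismatch) can be written out as a small lookup. This is only a sketch of the arithmetic in the definition; the names are hypothetical, and it says nothing about the salt, temperature or solvent conditions needed to achieve a given stringency in practice.

```python
# (label, maximum tolerated mismatch in percent), strictest tier first.
STRINGENCY_TIERS = [
    ("most preferred (90% stringency)", 10.0),
    ("more preferred (85% stringency)", 15.0),
    ("preferred (75% stringency)", 25.0),
]


def max_mismatch(stringency_pct):
    """Maximum percent mismatch tolerated at a given percent stringency."""
    return 100.0 - stringency_pct


def strictest_tier(mismatch_pct):
    """Return the strictest tier at which a duplex with this mismatch
    could still hybridize, or None if it exceeds every tier."""
    for label, limit in STRINGENCY_TIERS:
        if mismatch_pct <= limit:
            return label
    return None
```

Under this reading, a probe and target differing at 12% of positions would hybridize at 75% and 85% stringency but not at 90% stringency.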

When referring to a probe or primer, the term "specific for (a target sequence)" indicates that the probe or 
primer hybridizes under stringent conditions substantially only to the target sequence in a given sample comprising 
the target sequence. 

"Purified" - a "purified" peptide is a peptide that has been extracted from the cellular environment and 
separated from substantially all other cellular peptides. As used herein, the term peptide includes peptides,
polypeptides and proteins. In preferred embodiments, a "purified" peptide is a preparation in which the subject
peptide comprises 80% or more of the protein content of the preparation. For certain uses, such as vaccine 
preparations, even greater purity may be necessary. 

"Immunostimulatory" - the phrase "immunostimulatory peptide" as used herein refers to a peptide that is 
capable of stimulating INF-γ production in the assay described in section B.5 below. In preferred embodiments,
an immunostimulatory peptide is one capable of inducing greater than twice the background level of this assay
determined using T-cells stimulated with no antigens or negative control antigens. Preferably, the
immunostimulatory peptides are capable of inducing more than 0.01 ng/ml of INF-γ in this assay system. In
more preferred embodiments, an immunostimulatory peptide is one capable of inducing greater than 10 ng/ml of
INF-γ in this assay system.

B. MATERIALS AND METHODS 

1. STANDARD METHODOLOGIES 
The present invention utilizes standard laboratory practices for the cloning, manipulation and sequencing of
nucleic acids, purification and analysis of proteins and other molecular biological and biochemical techniques,
unless otherwise stipulated. Such techniques are explained in detail in standard laboratory manuals such as
Sambrook et al. (15) and Ausubel et al. (16).

Methods for chemical synthesis of nucleic acids are discussed, for example, in reference nos. 20 and 21. 
Chemical synthesis of nucleic acids can be performed, for example, on commercial automated oligonucleotide 
synthesizers. 

2. ISOLATION OF MYCOBACTERIUM TUBERCULOSIS DNA SEQUENCES
ENCODING IMMUNOSTIMULATORY PROTEINS

Mycobacterium tuberculosis DNA was obtained by the method of Jacobs et al. (22). Samples of
the isolated DNA were partially digested with one of the following restriction enzymes: HinPI, HpaII, AciI, TaqI,
BsaHI, NarI. Digested fragments of 0.2-5 kb were purified from agarose gels and then ligated into the BstBI site
in front of the truncated phoA gene in one or more of the three phagemid vectors pJDT1, pJDT2, and pJDT3.

A schematic representation of the phagemid vector pJDT2 is provided in Mdluli et al. (1995) (reference 
no. 31). The pJDT vectors were specifically designed for cloning and selecting genes encoding cell wall- 
associated, cytoplasmic membrane associated, periplasmic or secreted proteins (and especially for cloning such 
genes from GC rich genomes, such as the Mycobacterium tuberculosis genome). The vectors have a BstBI cloning
site in frame with the bacterial alkaline phosphatase gene (phoA) such that cloning of an in-frame sequence into the
cloning site will result in the production of a fusion protein. The phoA gene encodes a version of the alkaline
phosphatase that lacks a signal sequence; hence, only if the DNA cloned into the BstBI site includes a signal
sequence or a transmembrane sequence can the fusion protein be secreted to the medium or inserted into the
cytoplasmic membrane, periplasm or cell wall. Those clones encoding such fusion proteins may be detected by
plating clones on agar plates containing the indicator 5-bromo-4-chloro-3-indolyl-phosphate. Alkaline phosphatase
converts this indicator to a blue colored product. Hence, those clones containing secreted alkaline phosphatase 
fusion proteins will produce the blue color. 

The three vectors in this series (pJDT1, 2 and 3) have the BstBI restriction sites located in different reading
frames with respect to the phoA gene. This increases the likelihood of cloning any particular gene in the correct 
orientation and reading frame for expression by a factor of 3. Reference no. 31 describes pJDT vectors in detail. 
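
The reading-frame argument can be made concrete with a toy model. The sketch below assumes, purely for illustration, that the three vectors shift the phoA coding sequence by 0, 1 and 2 nucleotides relative to the cloning junction; the actual offsets are not given here and are described in reference no. 31. Insert orientation is not modelled.

```python
# Hypothetical frame offsets; the real vector junctions are not stated in the text.
VECTOR_OFFSETS = {"pJDT1": 0, "pJDT2": 1, "pJDT3": 2}


def in_frame_vectors(upstream_nt):
    """Return the vectors in which phoA is read in frame with the insert.

    upstream_nt is the number of insert nucleotides between the first base
    of the insert's start codon and the cloning junction.
    """
    return [
        name
        for name, offset in VECTOR_OFFSETS.items()
        if (upstream_nt + offset) % 3 == 0
    ]
```

Whatever the length of the cloned fragment, exactly one of the three offsets restores the frame, which is why using all three vectors covers every possible junction phase.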

3. SELECTION OF SECRETED FUSION PROTEINS 
The recombinant clones described above were transformed into E. coli and plated on agar plates containing
the indicator 5-bromo-4-chloro-3-indolyl-phosphate. Production of blue pigmentation, produced as a result of the
action of alkaline phosphatase on the indicator, indicated the presence of secreted, cytoplasmic membrane,
periplasmic, cell wall associated or outer membrane fusion proteins (because the bacterial alkaline phosphatase
gene in the vector lacks a signal sequence and could not otherwise escape the bacterial cell). A similar technique
has been used to identify M. tuberculosis genes encoding exported proteins by Lim et al. (32). 

Those clones producing blue pigmentation were picked and grown in liquid culture to facilitate the
purification of the alkaline phosphatase fusion proteins. These recombinant clones were designated according to
the restriction enzyme used to digest the Mycobacterium tuberculosis DNA (thus, clones designated A#2-1, A#2-2,
etc. were produced using Mycobacterium tuberculosis DNA digested with AciI).

4. PURIFICATION OF SECRETED FUSION PROTEINS 

PhoA fusion proteins were extracted from the selected E. coli clones by cell lysis and purified by SDS
polyacrylamide gel electrophoresis. Essentially, individual E. coli clones are grown overnight at 30°C with shaking
in 2 ml LB broth containing ampicillin, kanamycin and IPTG. The cells are pelleted by centrifugation and
resuspended in 100 µL Tris-EDTA buffer. 100 µL lysis buffer (1% SDS, 1 mM EDTA, 25 mM DTT, 10%
glycerol and 50 mM Tris-HCl, pH 7.5) is added to this mixture and DNA released from the cells is sheared by
passing the mixture through a small gauge syringe needle. The sample is then heated for 5 minutes at 100°C and
loaded onto an SDS PAGE gel (12 cm x 14 cm x 1.5 mm, made with 4% (w/v) acrylamide in the stacking section
and 10% (w/v) acrylamide in the separating section). Several samples from each clone are loaded onto each gel.

The samples are electrophoresed by application of 200 volts to the gel for 4 hours. Subsequently, the
proteins are transferred to a nitrocellulose membrane by Western blotting. A strip of nitrocellulose is cut off to be
processed with antibody, and the remainder of the nitrocellulose is set aside for eventual elution of the protein.
The strip is incubated with blocking buffer and then with anti-alkaline phosphatase primary antibody, followed by
incubation with anti-mouse antibody conjugated with horseradish peroxidase. Finally, the strip is developed with
the NEN DuPont Renaissance kit to generate a luminescent signal. The migratory position of the PhoA fusion
protein, as indicated by the luminescent label, is measured with a ruler, and the corresponding region of the
undeveloped nitrocellulose blot is excised.

This region of nitrocellulose, which contains the PhoA fusion protein, is then incubated in 1 ml 20%
acetonitrile at 37°C for 3 hours. Subsequently, the mixture is centrifuged to remove the nitrocellulose and the
liquid is transferred to a new test tube and lyophilized. The resulting protein pellet is dissolved in 100 µL of
endotoxin-free, sterile water and precipitated with acetone at -20°C. After centrifugation the bulk of the acetone is
removed and the residual acetone is allowed to evaporate. The protein pellet is re-dissolved in 100 µL of sterile
phosphate buffered saline. This procedure can be scaled up by modification to include IPTG induction 2 hours
prior to cell harvesting, washing nitrocellulose membranes with PBS prior to acetonitrile extraction and
lyophilization of acetonitrile extracted and acetone precipitated protein samples.

5. DETERMINATION OF IMMUNOSTIMULATORY CAPACITY IN MICE
The purified alkaline phosphatase - Mycobacterium tuberculosis fusion peptides encoded by the
recombinant clones were then tested for their ability to stimulate INF-γ production in mice. The test used to
determine INF-γ stimulation is essentially that described by Orme et al. (11).
Essentially, the assay method is as follows: The virulent strain M. tuberculosis Erdman is grown in
Proskauer Beck medium to mid-log phase, then aliquoted and frozen at -70°C for use as an inoculant. Cultures of
this bacterium are grown and harvested and mice are inoculated with 1 x 10^5 viable bacteria suspended in 200 µl
sterile saline via a lateral tail vein on day one of the test.

Bone marrow-derived macrophages are used in the test to present the bacterial alkaline phosphatase-
Mycobacterium tuberculosis fusion protein antigens. These macrophages are obtained by harvesting cells from
mouse femurs and culturing the cells in Dulbecco's modified Eagle medium as described by Orme et al. (11).
Eight to ten days later, up to ten µg of the fusion peptide to be tested is added to the macrophages and the cells are
incubated for 24 hours.

The CD4 cells are obtained by harvesting spleen cells from the infected mice and then pooling and
enriching for CD4 cells by removal of adherent cells by incubation on plastic Petri dishes, followed by incubation
for 60 minutes at 37°C with a mixture of J11d.2, Lyt-2.43, and GL4 monoclonal antibodies (mAb) in the presence
of rabbit complement to deplete B cells and immature T cells, CD8 cells, and γδ cells, respectively. The
macrophages are overlaid with 10* of these CD4 cells and the medium is supplemented with 5 U IL-2 to promote
continued T cell proliferation and cytokine secretion. After 72 hours, cell supernatants are harvested from sets of
triplicate wells and assayed for cytokine content.

Cytokine levels in harvested supernatants are assayed by sandwich ELISA as described by Orme et al. (11).

6. DETERMINATION OF IMMUNOSTIMULATORY CAPACITY IN HUMANS 
The purified alkaline phosphatase - Mycobacterium tuberculosis fusion peptides encoded by the
recombinant clones or by synthetic peptides are tested for their ability to induce INF-γ production by human T
cells in the following manner.

Blood from tuberculin-positive people (producing a positive tuberculin skin test) is collected in EDTA-coated
tubes to prevent clotting. Mononuclear cells are isolated using a modified version of the separation
procedure provided with the NycoPrep™ 1.077 solution (Nycomed Pharma AS, Oslo, Norway). Briefly, the blood
is diluted in an equal volume of a physiologic solution, such as Hanks Balanced Salt Solution (HBSS), and then
gently layered over the top of the NycoPrep solution in a 2 to 1 ratio in 50 ml tubes. The tubes are centrifuged at 800
x g for 20 minutes and the mononuclear cells are then removed from the interface between the NycoPrep solution
and the sample layer. The plasma is removed from the top of the tube, filtered through a 0.2 micron filter, and
then added to the tissue culture media. The mononuclear cells are washed twice: the cells are diluted in a
physiologic solution, such as HBSS or RPMI 1640, and centrifuged at 400 x g for 10 minutes. The mononuclear
cells are then resuspended to the desired concentration in tissue culture media (RPMI 1640 containing 10%
autologous serum, HEPES, non-essential amino acids, antibiotics and polymyxin B). The mononuclear cells are then
cultured in 96 well microtitre plates.

Peptides or PhoA fusion proteins are then added to individual wells in the 96 well plate, and cells are then
placed in an incubator (37°C, 5% CO2). Samples of the supernatants (tissue culture media from the wells
containing the cells) are collected at various time points (from 3 to 8 days) after the addition of the peptides or
PhoA fusion proteins. The immune responsiveness of T cells to the peptides and PhoA fusion proteins is assessed
by measuring the production of cytokines (including gamma-interferon).

Cytokines are measured using an Enzyme Linked Immunosorbent Assay (ELISA), the details of which are
described in the Cytokine ELISA Protocol in the PharMingen catalogue (PharMingen, San Diego, California). For
measuring the presence of human gamma-interferon, wells of a 96 well microtitre plate are coated with a
capture antibody (anti-human gamma-interferon antibody). The sample supernatants are then added to individual
wells. Any gamma-interferon present in the sample will bind to the capture antibody. The wells are then washed.
A detection antibody (anti-human gamma-interferon antibody), conjugated to biotin, is added to each well, and will
bind to any gamma-interferon that is bound to the capture antibody. Any unbound detection antibody is washed
away. An avidin peroxidase enzyme is added to each well (avidin binds tightly to the biotin on the detection
antibody). Any excess unbound enzyme is washed away. Finally, a chromogenic substrate for the enzyme is
added and the intensity of the colour reaction that occurs is quantitated using an ELISA plate reader. The quantity
of the gamma-interferon in the sample supernatants is determined by comparison with a standard curve using
known quantities of human gamma-interferon.
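
The final step, reading a sample concentration off a standard curve, can be sketched as follows. This is a simplified illustration and not the PharMingen protocol: it uses piecewise-linear interpolation between standards, whereas plate-reader software often fits a sigmoidal curve, and the function name and any standard values supplied to it are hypothetical.

```python
from bisect import bisect_left


def interpolate_concentration(od, standards):
    """Estimate a concentration from an optical density reading.

    standards is a list of (concentration, optical_density) pairs from
    wells containing known quantities of cytokine; optical density is
    assumed to rise strictly with concentration. Readings outside the
    range of the standards are rejected rather than extrapolated.
    """
    points = sorted(standards, key=lambda p: p[1])
    ods = [p[1] for p in points]
    if od < ods[0] or od > ods[-1]:
        raise ValueError("reading lies outside the standard curve")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return points[i][0]
    (c0, o0), (c1, o1) = points[i - 1], points[i]
    # Linear interpolation between the two bracketing standards.
    return c0 + (c1 - c0) * (od - o0) / (o1 - o0)
```

With made-up standards of (0.0, 0.05), (1.0, 0.25), (5.0, 1.05) and (10.0, 2.05) in ng/ml and absorbance units, a sample reading of 0.65 interpolates to about 3.0 ng/ml.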

Measurement of other cytokines, such as Interleukin-2 and Interleukin-4, can be determined using the same
protocol, with the appropriate substitution of reagents (monoclonal antibodies and standards). 

7. DNA SEQUENCING

The sequencing of the alkaline phosphatase fusion clones was undertaken using the AmpliCycle thermal
sequencing kit (Perkin Elmer, Applied Biosystems Division, 850 Lincoln Centre Drive, Foster City, CA 94404,
U.S.A.), using a primer designed to read out of the alkaline phosphatase gene into the Mycobacterium tuberculosis
DNA insert, or primers specific to the cloned sequences.

C. RESULTS

1. IMMUNOSTIMULATORY CAPACITY 
More than 300 fusion clones were tested for their ability to stimulate INF-γ production. Of these, 80
clones were initially designated as having some ability to stimulate INF-γ production. Tables 1 and 2 show the data
obtained for these 80 clones. Clones placed in Table 1 showed the greatest ability to stimulate INF-γ production
(greater than 10 ng/ml of INF-γ) while clones placed in Table 2 stimulated the production of between 2 ng/ml and
10 ng/ml of INF-γ. Background levels of INF-γ production (i.e., levels produced without any added M.
tuberculosis antigen) were subtracted from the levels produced by the fusions to obtain the figures shown in these
tables.
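
The background subtraction and the assignment of clones to the two tables can be sketched as follows. This is an illustration only: the function name is hypothetical, values are expressed in pg/ml (10 ng/ml = 10,000 pg/ml), and treating the 2 ng/ml boundary as inclusive is an assumption.

```python
def classify_clone(measured_pg_ml, background_pg_ml):
    """Return (net INF-gamma in pg/ml, table assignment) for one fusion clone."""
    # Subtract the level produced without any added M. tuberculosis antigen.
    net = measured_pg_ml - background_pg_ml
    if net > 10_000:          # greater than 10 ng/ml
        table = "Table 1"
    elif net >= 2_000:        # between 2 ng/ml and 10 ng/ml
        table = "Table 2"
    else:
        table = None          # below the range reported in either table
    return net, table
```

For example, a clone measured at 7,307 pg/ml against a hypothetical background of 400 pg/ml gives a net value of 6,907 pg/ml and falls in Table 2.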



TABLE 1

Immunostimulatory AP-fusion clones

No.  Name           INF          Fus-MW       TBport       coding       Similarity (score)
1    AciI#1-152     > 40,000     ~65,000      ~23,400      ~633         M. avium acetolactate synthase (98+)
2    AciI#1-247     > 40,000     ~160,000     ~118,400     ~3,198       peptide synthetase (153)
3    AciI#1-264     > 40,000     ~72,500      ~30,900      ~833         nothing evident
4    AciI#1-435     > 40,000     ~80,000      ~38,400      ~1,038       M. smegmatis ethambutol resistance gene EmbA (624)
5    HinP#1-27      > 20,000     59,000       17,400       471          nothing evident
6    HinP#2-92      > 20,000     74,600       33,000       891          1. M. tuberculosis ORF MTCY190.11C (1794+); 2. Cytochrome C oxidase subunit II (141)
7    HinP#2-145     > 20,000     60,000       13,900       375          nothing evident
8    HinP#2-150     > 20,000     55,000       13,400       362          nothing evident
9    HinP#1-200     > 20,000     53,500       11,900       321          nothing evident
10   HinP#3-30      > 20,000     69,000       27,400       740          M. leprae chromosome sequence in B983 region (281+)
11   AciI#2-2       > 20,000     70,000       28,400       768          M. leprae chromosome sequence within region B1529 (139)
12   AciI#2-23      > 20,000     75,000       33,400       903          Region within sequence MLHKJUy of the M. leprae chromosome
13   AciI#2-506     > 20,000     60,000       18,400       70           nothing evident
14   AciI#2-511     > 20,000     ~60,000      ~18,400      ~498         nothing evident
15   AciI#2-639     > 20,000     ~60,000      ~18,400      ~498         nothing evident
16   AciI#2-822     > 20,000     ~45,000      ~3,400       ~93          M. tuberculosis sequence within region MD0074 (U27357) (55 P)
17   AciI#2-823     > 20,000     ~46,500      ~4,900       ~132         nothing evident
18   AciI#2-825     > 20,000     ~150,000     ~110,000     ~2,970       M. tuberculosis sequence MTCY31.03c (431)
19   AciI#2-827     > 20,000     ~48,000      ~6,400       ~174         cytochrome d oxidase
20   AciI#2-898     > 20,000     ~49,000      ~7,400       ~201         nothing evident
21   AciI#2-1084    [missing]    [missing]    [illegible]  [illegible]  Sequences within M. tuberculosis clone X68281 (96+) and M. leprae clone B983 (122*)
22   AciI#3-47      > 20,000     ~55,000      ~13,400      ~363         nothing evident
23   AciI#3-133     > 20,000     ~55,000      ~13,400      ~363         nothing evident
24   AciI#3-166     > 20,000     ~48,000      ~6,400       ~174         nothing evident
25   AciI#3-167     > 20,000     ~65,000      ~23,400      ~633         M. leprae DNA sequence within region B983 (588+)
26   AciI#3-206     > 20,000     ~65,000      ~23,400      ~633         M. leprae DNA sequence within chromosome region MD0092 (91)
27   HinP#1-31      14,638       ~46,000      ~4,400       ~120         M. tuberculosis 19 kDa lipoprotein antigen precursor (218)
28   HinP#1-144     13,546       ~70,000      ~23,900      ~645         M. leprae DNA sequence within chromosome region B983 (78)
29   HinP#1-3       11,550       ~49,000      ~7,400       ~200         M. leprae DNA sequence within chromosome region B983 (100+)
30   AciI#1-486     11,416       ~45,000      ~3,400       ~93          nothing known
31   AciI#1-426     11,135       ~47,500      ~5,900       ~160         Dipeptide transport protein (65)
32   AciI#2-916     10,865       ~75,000      ~33,400      ~903         nothing evident

Abbreviations: INF: pg/ml of INF-γ produced using fusion to stimulate immune T-cells. Fus-MW: Relative
molecular weight of the fusion protein in Da. TBport: Estimated amount of fusion attributable to the M.
tuberculosis protein. Coding: Amount of DNA needed to encode TB portion of fusion proteins (in base
pairs). Similarity: Amino acid sequence similarity seen by analysis of DNA via the BLASTX or TBLASTX*
programs. Scores for alignments are indicated in ( ). Due to the high G + C nature of M. TB DNA many false
positives are evident. Only scores above 100 have good credibility.

TABLE 2

Immunostimulatory AP-fusion clones (cont'd)

No.  Clone Name     INF          Fus-MW       TBport       coding       Similarity (score)
1    AciI#1-[illegible] [missing] ~43,000     ~1,400       ~39          M. tuberculosis MTCY190.11C cytochrome C oxidase subunit II (198); M. leprae sequence in B1551 region (1087+)
2    AciI#2-14      6,907        ~45,000      ~3,400       ~93          nothing evident
3    AciI#2-26      3,089        ~72,000      ~30,400      ~822         nothing evident
4    AciI#2-35      3,907        ~45,000      ~3,400       ~93          Possibly similar to M. leprae sequence in the B983 region (116+)
5    AciI#2-147     5,464        [missing]    [missing]    [missing]    nothing evident
6    AciI#2-508     7,052        ~70,000      ~28,400      ~768         Similar to sequence of the M. leprae ORF encoding gp U00018 (125) and similar to sequence in the B2168 region of M. leprae genome (225+)
7    AciI#2-510     2,445        ~69,000      ~27,400      ~741         nothing evident
8    AciI#2-523     2,479        ~50,000      ~8,400       ~228         Similar to M. tuberculosis sequence z70692 from clone Y427 (96)
9    AciI#2-676     3,651        ~70,000      ~28,400      ~768         Similar to AciI#2-639
10   AciI#2-834     5,942        ~60,000      ~13,900      ~375         nothing evident
11   AciI#2-854     5,560        ~44,000      ~2,400       ~66          nothing evident
12   AciI#2-872     2,361        ~47,000      ~5,400       ~147         nothing evident
13   AciI#2-874     2,171        ~45,000      ~3,400       ~93          nothing evident
14   AciI#2-884     2,729        ~85,000      ~43,400      ~1,173       Isocitrate dehydrogenase (247)
15   AciI#2-894     3,396        ~70,000      ~28,400      ~768         nothing evident
16   AciI#2-1014    6,302        ~45,000      ~3,400       ~93          nothing evident
17   AciI#2-1018    4,642        ~55,000      ~13,400      ~363         nothing evident
18   AciI#2-1025    3,582        ~45,000      ~3,400       ~93          nothing evident
19   AciI#2-1034    2,736        ~80,000      ~38,400      ~103         nothing evident
20   AciI#2-1035    3,454        ~46,000      ~4,400       ~120         nothing evident
21   AciI#2-1089    8,974        ~65,000      ~23,400      ~633         Similar to M. tuberculosis sequence X75361 and sequence in M. bovis MD0057 and U34849 regions. Immunogenic proteins MPB64 and MPT64 are homologous
22   AciI#2-1090    7,449        ~65,000      ~23,400      ~633         nothing evident
23   AciI#2-1104    5,148        ~68,000      ~26,400      ~714         Similar to M. tuberculosis sequence X80268 and to cds 1 [illegible] in M. leprae sequence region MD0045 (169+); secreted antigenic protein
24   AciI#3-9       3,160        ~67,000      ~25,400      ~687         nothing evident
25   AciI#3-12      3,891        ~75,000      ~33,400      ~903         Penicillin binding protein; similar to M. leprae sequence within [remainder missing]
26   AciI#3-15      4,019        ~65,000      ~23,400      ~633         nothing evident
27   AciI#3-21      2,301        ~69,000      ~27,400      ~741         nothing evident
28   AciI#3-78      2,905        ~65,000      ~23,400      ~633         Similar to sequence within M. leprae genomic clone B983
29   AciI#3-134     3,895        ~45,000      ~3,400       ~93          nothing evident
30   AciI#3-204     4,774        ~60,000      ~13,900      ~375         nothing evident
31   AciI#3-214     7,333        ~50,000      8,400        ~228         nothing evident
32   AciI#3-243     2,857        ~65,000      ~23,400      ~633         nothing evident
33   AciI#3-281     2,943        ~65,000      ~23,400      ~633         Similar to sequence within M. leprae genomic clone B983
34   BsaHI#1-21     8,122        ~90,000      ~48,400      ~1,209       nothing evident
35   HinP#1-12      2,905        ~66,000      ~24,400      ~660         possible tyrosine phosphatase
36   HinP#2-23      2,339        ~43,000      ~1,400       ~39          Similar to sequence in M. leprae genomic clone MD0009-0-(B13) (354)
37   HinP#1-142     6,258        ~69,000      ~27,400      ~741         nothing evident
38   HinP#2-4       6,567        ~66,000      ~24,400      ~660         nothing evident
39   HinP#2-143     3,689        ~65,000      ~23,400      ~633         Similar to sequence in M. leprae genomic clone B1529
40   HinP#2-145A    2,314        ~64,000      ~22,400      ~606         nothing evident
41   HinP#2-147     7,021        65,000       23,400       ~633         nothing evident
42   HinP#3-28      2,980        70,000       28,400       ~768         Similar to M. leprae sequence in genomic clones MD0085 and sequence for M. leprae gp U00013 cds 27 of B1496 region
43   HinP#3-34      2,564        71,000       29,400       ~795         Similar to sequence in M. leprae genomic clone B2168 (U00018 cds 9)
44   HinP#3-41      3,296        48,000       6,400        ~1,728       Similar to antigen 85 complex protein subunit
45   HpaII#1-3      2,360        65,000       23,400       ~633         Cytochrome C oxidase subunit II (156). Similar to M. tuberculosis sequence on clone MTCY190.11c
46   HpaII#1-8      2,048        110,000      68,400       ~1,848       nothing evident
47   HpaII#1-10     4,178        55,000       13,400       ~633         Similar to immunogenic proteins MPB64/MPT64
48   HpaII#1-13     3,714        43,000       1,400        ~39          nothing evident

Abbreviations: INF: pg/ml of INF-γ produced using fusion to stimulate immune T-cells. Fus-MW: Relative
molecular weight of the fusion protein. TBport: Estimated amount of fusion attributable to the M.
tuberculosis protein. Coding: Amount of DNA needed to encode TB portion of fusion proteins. Similarity:
Amino acid sequence similarity seen by analysis of DNA via the BLASTX or TBLASTX* programs. Scores
for alignments are indicated in ( ). Due to the high G + C nature of M. TB DNA many false positives are
evident. Only scores above 100 have good credibility.
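
The Fus-MW, TBport and coding columns of Tables 1 and 2 appear to be linked by simple arithmetic. The sketch below assumes that the TB portion is the fusion weight minus a fixed alkaline phosphatase portion of about 41,600 Da, and that an average residue weighs about 111 Da and is encoded by three base pairs. Both constants are inferred from the tabulated values; they are not stated in the text, and some rows deviate from them.

```python
PHOA_PORTION_DA = 41_600   # assumption: inferred from Fus-MW minus TBport
DA_PER_RESIDUE = 111       # assumption: inferred average residue mass

def tb_portion(fusion_mw_da):
    """Estimated mass (Da) of the fusion attributable to M. tuberculosis."""
    return fusion_mw_da - PHOA_PORTION_DA

def coding_bp(tb_portion_da):
    """Approximate length (bp) of DNA needed to encode the TB portion."""
    residues = round(tb_portion_da / DA_PER_RESIDUE)
    return 3 * residues    # three bases per codon
```

Under these assumptions a fusion of about 65,000 Da corresponds to a TB portion of about 23,400 Da and roughly 633 bp of coding DNA, in line with the entries for several clones.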



2. DNA SEQUENCING AND DETERMINATION OF OPEN READING FRAMES 

DNA sequence data for the sequences of the Mycobacterium tuberculosis DNA present in the clones shown
in Tables 1 and 2 are shown in the accompanying Sequence Listing. The sequences are believed to represent the
coding strand of the Mycobacterium DNA. In most instances, these sequences represent only partial sequences of
the immunostimulatory peptides and, in turn, only partial sequences of Mycobacterium tuberculosis genes.
However, each of the clones from which these sequences were derived encodes, by itself, at least one
immunostimulatory T-cell epitope. As discussed in part V below, one of ordinary skill in the art will, given the
information provided herein, readily be able to obtain the immunostimulatory peptides and corresponding full
length M. tuberculosis genes using standard techniques. Accordingly, the nucleotide sequences of the present
invention encompass not only those sequences presented in the sequence listings, but also the complete nucleotide
sequence encoding the immunostimulatory peptides as well as the corresponding M. tuberculosis genes. The
nucleotide abbreviations employed in the sequence listings are as follows in Table 3:

TABLE 3



Symbol   Meaning
A        A; adenine
C        C; cytosine
G        G; guanine
T        T; thymine
U        U; uracil
M        A or C
R        A or G
W        A or T/U
S        C or G
Y        C or T/U
K        G or T/U
V        A or C or G; not T/U
H        A or C or T/U; not G
D        A or G or T/U; not C
B        C or G or T/U; not A
N        (A or C or G or T/U) or (unknown or other or no base)
         indeterminate*
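
As an illustration of how the ambiguity symbols of Table 3 may be handled in practice, the following minimal sketch expands an ambiguous DNA sequence into every unambiguous sequence it can represent. The function names are hypothetical, and the mapping treats the sequence as DNA (T rather than U).

```python
from itertools import product

# Nucleotide symbols as listed in Table 3, written for DNA.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

def expand(seq):
    """Yield every unambiguous sequence matching an ambiguous one."""
    choices = [IUPAC[base] for base in seq.upper()]
    for combo in product(*choices):
        yield "".join(combo)

def is_ambiguous(seq):
    """True if any position carries a symbol standing for more than one base."""
    return any(len(IUPAC[base]) > 1 for base in seq.upper())
```

For instance, `list(expand("GAY"))` gives the two sequences GAC and GAT.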



The DNA sequences obtained were then analyzed with respect to the G + C content as a function of codon
position over a window of 120 codons using the 'FRAME' computer program (Bibb, M.J.; Findlay, P.R.; and
Johnson, M.W.; Gene 30: 157-166 (1984)). This program uses the bias of these nucleotides for each of the codon
positions to enable the correct reading frame to be identified.
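
The idea behind this kind of analysis can be sketched in a few lines. This is not the FRAME program itself: it scores a whole sequence rather than a sliding window of 120 codons, and it assumes, as is typical for G + C rich genomes, that the correct frame shows the highest G + C content at the third codon position.

```python
def gc_by_codon_position(seq, frame=0):
    """Return the G+C fraction at codon positions 1, 2 and 3 for one frame."""
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3       # drop a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]          # every third base from this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return fractions

def best_frame(seq):
    """Pick the frame (0, 1 or 2) whose third codon position is richest in G+C."""
    return max(range(3), key=lambda f: gc_by_codon_position(seq, f)[2])
```

A windowed version would apply `gc_by_codon_position` to successive stretches of the sequence and plot the three fractions along its length.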

3. IDENTIFICATION OF T CELL EPITOPES IN THE IMMUNOSTIMULATORY
PEPTIDES

The T-Site program, by Feller, D.C. and de la Cruz, V.F., MedImmune Inc., 19 Firstfield Rd.,
Gaithersburg, MD 20878, U.S.A., was used to predict T-cell epitopes from the determined coding sequences. It
uses a series of four predictive algorithms. In particular, peptides were designed against regions indicated by the
algorithm "A" motif, which predicts alpha-helical periodicity (Margalit, H.; Spouge, J.L.; Cornette, J.L.; Cease,
K.B.; DeLisi, C.; and Berzofsky, J.A., J. Immunol., 138:2213 (1987)) and amphipathicity, and those indicated by
the algorithm "R" motif, which identifies segments which display similarity to motifs known to be recognized by
MHC class I and class II molecules (Rothbard, J.B. and Taylor, W.R., EMBO J. 7:93 (1988)). The other two
algorithms identify classes of T-cell epitopes recognized in mice.

4. SYNTHESIS OF SYNTHETIC PEPTIDES CONTAINING T CELL EPITOPES IN
IDENTIFIED IMMUNOSTIMULATORY PEPTIDES

A series of staggered peptides were designed to overlap regions indicated by the T-site analysis. These
were synthesized by Chiron Mimotopes Pty. Ltd. (11055 Roselle St., San Diego, CA 92121, U.S.A.).

Peptides designed from sequences described in this application include:

HinP#1-200 (6 peptides)

Peptide Sequence        Peptide Name
VHLATGMAETVASFSPS       HPI1-200/2
REVVHLATGMAETVASF       HPI1-200/3
RDSREVVHLATGMAETV       HPI1-200/4
DFNRDSREVVHLATGMA       HPI1-200/5
ISAAWTGYLRWTTPDR        HPI1-200/6
AWFLCAAAISAAWTG         HPI1-200/7

AciI#2-827 (14 peptides)

Peptide Sequence        Peptide Name
VTDNPAWYRLTKFFGKL       CD-2/1/96/1
AWYRLTKFFGKLFLINF       CD-2/1/96/2
KFFGKLFLINFAIGVAT       CD-2/1/96/3
FLINFAIGVATGIVQEF       CD-2/1/96/4
AIGVATGIVQEFQFGMN       CD-2/1/96/5
TGIVQEFQFGMNWSEYS       CD-2/1/96/6
EFQFGMNWSEYSRFVGD       CD-2/1/96/7
MNWSEYSRFVGDVFGAP       CD-2/1/96/8
WSEYSRFVGDVFGAPLA       CD-2/1/96/9
EYSRFVGDVFGAPLAME       CD-2/1/96/10
SRFVGDVFGAPLAMESL       CD-2/1/96/11
WIFGWNRLPRLVHLACI       CD-2/1/96/12
WNRLPRLVHLACIWIVA       CD-2/1/96/13
GRAELSSIWLLTNNTA        CD-2/1/96/14

HinP#1-3 (2 peptides)

Peptide Sequence        Peptide Name
GKTYDAYFTDAGGITPG       HPI1-3/2
YDAYFTDAGGITPGNSV       HPI1-3/3

HinP#1-3 / HinP#1-200 combined peptides

Peptide Sequences                   Peptide Name
WPQGKTYDAYFTDAGGI (HinP#1-3)        HPI1-3/1 (combined)
ATGMAETVASFSPSEGS (HinP#1-200)

AciI#2-823 (1 peptide)

Peptide Sequence        Peptide Name
GWERRLRHAVSPKDPAQ       AI2-823/1

HinP#1-31 (4 peptides)

Peptide Sequence        Peptide Name
TGSGETTTAAGTTASPG       HPI1-31/1
GAAILVAGLSGCSSNKS       HPI1-31/2
AVAGAAILVAGLSGCSS       HPI1-31/3
LTVAVAGAAILVAGLSG       HPI1-31/4

These synthetic peptides were resuspended in phosphate buffered saline to be tested to confirm their ability
to function as T cell epitopes using the procedure described in part IV(B)(6) above.
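
The staggered design can be illustrated with a short sketch. The window of 17 residues and the offset of 3 residues are inferred from the HinP#1-200 peptides listed above rather than stated in the text, and the example region is assembled from those overlapping peptides (reading the extracted "W" as "VV" where the 17-residue length requires it, which is an assumption).

```python
def staggered_peptides(region, length=17, step=3):
    """Return overlapping peptides of `length` residues, each offset by `step`."""
    if len(region) < length:
        return [region]
    return [region[i:i + length]
            for i in range(0, len(region) - length + 1, step)]

# Region assembled from the overlapping HinP#1-200 peptides (assumption).
region = "DFNRDSREVVHLATGMAETVASFSPS"
for peptide in staggered_peptides(region):
    print(peptide)
```

With these settings the 26-residue region yields four 17-residue peptides, each shifted by three residues relative to the previous one.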

5. CONFIRMATION OF IMMUNOSTIMULATORY CAPACITY USING T CELLS 
FROM TUBERCULOSIS PATIENTS 

The synthetic peptides described above, along with a number of the PhoA fusion proteins shown to be
immunostimulatory in mice, were tested for their ability to stimulate gamma interferon production in T-cells from
tuberculin positive people using the methods described in part IV(B)(6) above. For each assay, 5 x 10^5
mononuclear cells were stimulated with up to 1 µg/ml M. tuberculosis peptide or up to 50 ng/ml Pho A fusion
protein. M. tuberculosis filtrate proteins, Con A and PHA were employed as positive controls. An assay was run
with media alone to determine background levels, and Pho A protein was employed as a negative control.

The results, shown in Table 4 below, indicate that all of the peptides tested stimulated gamma interferon
production from T-cells of a particular subject.



TABLE 4

Peptide or Pho A fusion protein name            Interferon-gamma (pg/ml)   Interferon-gamma minus background (pg/ml)
CD-2/1/96/1                                     256.6                      153.3
CD-2/1/96/9                                     187.6                      84.3
CD-2/1/96/10                                    134.0                      30.7
CD-2/1/96/11                                    141.6                      38.3
CD-2/1/96/14                                    310.2                      206.9
HPI1-3/2                                        136.3                      23.0
HPI1-3/3                                        264.2                      160.9
AciI 2-898                                      134.0                      30.7
AciI 3-47                                       386.8                      283.5
M. tuberculosis filtrate proteins (10 µg/ml)    256.6                      153.3
M. tuberculosis filtrate proteins (5 µg/ml)     134.0                      30.7
Con A (10 µg/ml)                                2,839                      2,735.7
PHA (1%)                                        10,378                     10,274.7
Pho A control (10 pg/ml)                        26.7                       0
Background                                      103.3                      0

V. CLONING OF FULL LENGTH MYCOBACTERIUM TUBERCULOSIS T-CELL EPITOPE ORFS

Most of the sequences presented represent only part of a larger M. tuberculosis ORF. If desired, the full
length M. tuberculosis ORFs that include these provided nucleotide sequences can be readily obtained by one of
ordinary skill in the art, based on the sequence data provided herein.
A. GENERAL METHODOLOGIES 

Methods for obtaining full length genes based on partial sequence information are standard in the art and
are particularly simple for prokaryotic genomes. By way of example, the full length ORFs corresponding to the
DNA sequences presented herein may be obtained by creating a library of Mycobacterium tuberculosis DNA in a
plasmid, bacteriophage or phagemid vector and screening this library with a hybridization probe using standard
colony hybridization techniques. The hybridization probe consists of an oligonucleotide derived from a DNA
sequence according to the present invention, labelled with a suitable marker to enable detection of hybridizing
clones. Suitable markers include radionuclides, such as P-32, and non-radioactive markers, such as biotin.
Methods for constructing suitable libraries, production and labelling of oligonucleotide probes and colony
hybridization are standard laboratory procedures and are described in standard laboratory manuals such as
reference nos. 15 and 16.

Once a clone that hybridizes with the oligonucleotide has been identified, it is isolated and sequenced
using standard methods such as those described in Chapter 13 of reference no. 15. Determination of the translation
initiation point of the DNA sequence enables the ORF to be located.

An alternative approach to cloning the full length ORFs corresponding to the DNA sequences provided
herein is the use of the polymerase chain reaction (PCR). In particular, the inverse polymerase chain reaction
(IPCR) is useful to isolate DNA sequences flanking a known sequence. Methods for amplification of flanking
sequences by IPCR are described in Chapter 27 of reference no. 17 and in reference no. 23.

Accordingly, one aspect of the present invention is small oligonucleotides encompassed by the DNA
sequences presented in the Sequence Listing. These small oligonucleotides are useful as hybridization probes and
PCR primers that can be employed to clone the corresponding full length Mycobacterium tuberculosis ORFs. In
preferred embodiments, these oligonucleotides will comprise at least 15 contiguous nucleotides of a DNA sequence
set forth in the Sequence Listing, and in more preferred embodiments, such oligonucleotides will comprise at least
20 contiguous nucleotides of a DNA sequence set forth in the Sequence Listing.

One skilled in the art will appreciate that hybridization probes and PCR primers are not required to exactly
match the target gene sequence to which they anneal. Therefore, in another embodiment, the oligonucleotides will
comprise a sequence of at least 15 nucleotides and preferably at least 20 nucleotides, the oligonucleotide sequence
being substantially similar to a DNA sequence set forth in the Sequence Listing. Preferably, such oligonucleotides
will share at least about 75%-90% sequence identity with a DNA sequence set forth in the Sequence Listing and
more preferably the shared sequence identity will be greater than 90%.
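
The length and identity criteria can be checked with a simple ungapped comparison, sketched below. The function names are hypothetical, and the default threshold of 75% is taken as the lower end of the range given above; a real comparison might use an alignment tool instead of position-by-position matching.

```python
def percent_identity(oligo, target_segment):
    """Percent of positions at which two equal-length sequences match."""
    if not oligo or len(oligo) != len(target_segment):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(oligo.upper(), target_segment.upper()))
    return 100.0 * matches / len(oligo)

def meets_embodiment(oligo, target_segment, min_length=15, min_identity=75.0):
    """True if the oligonucleotide satisfies both the length and identity criteria."""
    return (len(oligo) >= min_length
            and percent_identity(oligo, target_segment) >= min_identity)
```

A 20-nucleotide oligonucleotide with two mismatches against its target segment, for example, shares 90% identity and satisfies the criteria.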

B. EXAMPLE - CLONING OF THE FULL LENGTH ORF CORRESPONDING TO CLONE HinP 
#2-92 

Using the techniques described below, the full length gene corresponding to the clone HinP #2-92 was
obtained. This gene, herein termed mtb2-92, includes an open-reading frame of 1089 bp (identified based on the
G + C content relating to codon position). The alternative 'GTG' start codon was used, and this was preceded
(8 bps upstream) by a Shine-Dalgarno motif. The gene mtb2-92 encoded a protein (termed MTB2-92) containing
363 amino acid residues with a predicted molecular weight of 40,436.4 Da.

Sequence homology comparisons of the predicted amino acid sequence of MTB2-92 with known proteins in
the database indicated similarity to the cytochrome c oxidase subunit II of many different organisms. This integral
membrane protein is part of the electron transport chain, subunits I and II forming the functional core of the
enzyme complex.

1. CLONING THE FULL LENGTH GENE CORRESPONDING TO HinP #2-92 
The plasmid pHin2-92 was restricted with either BamHI or EcoRI and then subcloned into the vector M13.
The insert DNA fragments were sequenced under the direction of M13 universal sequencing primers (Yanisch-Perron,
C. et al., 1985) using the AmpliCycle thermal sequencing kit (Perkin Elmer, Applied Biosystems Division,
850 Lincoln Centre Drive, Foster City, CA 94404, U.S.A.). The 5'-partial MTB2-92 DNA sequence was aligned
using a GeneWorks (Intelligenetics, Mountain View, CA, U.S.A.) program. Based on the sequence data obtained,
two oligomers were synthesized. These oligonucleotides (5' CCCAGCTTGTGATACAGGAGG 3' and
5' GGCCTCAGCGCGGCTCCGGAGG 3') represented sequences upstream and downstream, over a 0.8 kb distance,
of the sequence encoding the partial MTB2-92 protein in the alkaline phosphatase fusion.
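
Basic properties of the two screening oligonucleotides can be estimated as in the sketch below. It assumes the Wallace rule (2 °C per A/T and 4 °C per G/C), which is only a rough guide intended for short oligonucleotides; the function name is hypothetical and the text does not state how the primers were evaluated.

```python
def primer_stats(primer):
    """Return length, G+C percentage and a rough Wallace-rule melting temperature."""
    primer = primer.upper()
    gc = sum(base in "GC" for base in primer)
    at = len(primer) - gc
    return {
        "length": len(primer),
        "gc_percent": round(100.0 * gc / len(primer), 1),
        "tm_wallace_c": 2 * at + 4 * gc,   # rough estimate in degrees C
    }

upstream = "CCCAGCTTGTGATACAGGAGG"
downstream = "GGCCTCAGCGCGGCTCCGGAGG"
for name, seq in (("upstream", upstream), ("downstream", downstream)):
    print(name, primer_stats(seq))
```

Both oligonucleotides are G + C rich, as would be expected for primers derived from M. tuberculosis DNA.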

A Mycobacterium tuberculosis genomic cosmid DNA library was screened using PCR (Sambrook, J. et al.,
1989) in order to obtain the full-length gene encoding the MTB2-92 protein. Two hundred and ninety-four
bacterial colonies containing the cosmid library were pooled into 10 groups in 100 µl distilled water aliquots and
boiled for 5 min. The samples were spun in a microfuge at maximal speed for 5 min. The supernatants were
decanted and stored on ice prior to PCR analysis.

The 100 µl PCR reaction contained: 10 µl supernatant containing cosmid DNA, 10 µl of 10X PCR buffer,
250 µM dNTPs, 300 nM downstream and upstream primers, and 1 unit Taq DNA polymerase.
The reactions were heated at 95°C for 2 min and then 40 cycles of DNA synthesis were performed (95°C
for 30 s, 65°C for 1 min, 72°C for 2 min). The PCR products were loaded into a 1% agarose gel in TAE buffer
(Sambrook, J. et al., 1989) for analysis.

The supernatant which produced 800 bp PCR products was then further divided into 10 samples and the
PCR reactions were performed again. The colony which had resulted in the correctly sized PCR product was then
picked. The cosmid DNA from the positive clone (pG3) was prepared using the Wizard Mini-Prep Kit (Promega
Corp, Madison, WI, U.S.A.). The cosmid DNA was further sequenced using specific oligonucleotide primers.
The deduced amino acid sequence of the MTB2-92 protein is shown in Fig. 1.
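
The pooled screening strategy can be sketched as a repeated subdivision of positive pools. This is a simplified illustration: `pcr_positive` is a hypothetical stand-in for running the PCR on a pooled lysate and checking for the expected product, and the final colony-by-colony check is an assumption about how the positive colony is confirmed.

```python
def split_into_pools(items, n_pools):
    """Divide items into n_pools groups of near-equal size."""
    return [items[i::n_pools] for i in range(n_pools)]

def screen(colonies, pcr_positive, n_pools=10):
    """Narrow a colony collection to the positive colonies by repeated pooling."""
    candidates = list(colonies)
    rounds = 0
    while len(candidates) > n_pools:
        pools = split_into_pools(candidates, n_pools)
        # Keep only colonies that belong to a pool giving the expected product.
        kept = [c for pool in pools if pool and pcr_positive(pool) for c in pool]
        rounds += 1
        if len(kept) == len(candidates):
            break                         # pooling no longer narrows the search
        candidates = kept
    positives = [c for c in candidates if pcr_positive([c])]
    return positives, rounds
```

With 294 colonies, ten pools and a single positive colony, two rounds of pooling reduce the candidates to a handful of colonies that can be tested individually.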

2. EXPRESSION OF THE FULL LENGTH GENE 

To conveniently purify the recombinant protein, a histidine tag coding sequence was engineered
immediately upstream of the start codon of mtb2-92 using PCR. Two unique restriction enzyme sites, for XbaI and
HindIII, were added to the respective ends of the PCR product for convenient subcloning. Two oligomers were used to
direct the PCR reaction: (5' TCTAGACACCACCACCACCACCACGTGACACCTCGCGGGCCAGGTC 3' and
5' AAGCTTCGCCATGCCGCCGGTAAGCGCC 3')
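
The layout of the two oligomers can be examined with a short sketch. It relies on the known recognition sequences of XbaI (TCTAGA) and HindIII (AAGCTT) and on CAC being a histidine codon; the function name is hypothetical, and the primer sequences are copied from the text above.

```python
XBAI_SITE = "TCTAGA"       # XbaI recognition sequence
HINDIII_SITE = "AAGCTT"    # HindIII recognition sequence
HIS_CODON = "CAC"          # one of the two histidine codons (CAC, CAT)

def dissect_forward_primer(primer):
    """Split the forward primer into site, histidine codons and gene-specific part."""
    primer = primer.upper()
    if not primer.startswith(XBAI_SITE):
        raise ValueError("primer does not begin with an XbaI site")
    pos = len(XBAI_SITE)
    his_codons = 0
    while primer[pos:pos + 3] == HIS_CODON:   # count consecutive CAC codons
        his_codons += 1
        pos += 3
    return {
        "site": XBAI_SITE,
        "his_codons": his_codons,
        "start_codon": primer[pos:pos + 3],
        "gene_specific": primer[pos:],
    }

forward = "TCTAGACACCACCACCACCACCACGTGACACCTCGCGGGCCAGGTC"
reverse = "AAGCTTCGCCATGCCGCCGGTAAGCGCC"
print(dissect_forward_primer(forward))
print("reverse primer starts with HindIII site:", reverse.startswith(HINDIII_SITE))
```

The run of histidine codons is followed directly by GTG, consistent with the alternative start codon described for mtb2-92.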

The 100 µl PCR reaction contained: 1 µg pG3 template DNA, 250 µM dNTPs, 300 nM of each primer,
10 µl of 10X PCR buffer, and 1 unit Taq DNA polymerase. The PCR DNA synthesis cycle was performed as above.

The 1.4 kb PCR products were purified and ligated into the cloning vector pGEM-T (Promega). Inserts
were removed by digestion with both XbaI and HindIII, and the 1.4 kb fragment was directionally subcloned
into the XbaI and HindIII sites of the pMAL-c2 vector (New England Bio-Labs Ltd., 3397 American Drive, Unit 12,
Mississauga, Ontario, L4V 1T8, Canada). The gene encoding MTB2-92 was fused, in frame, downstream of the
maltose binding protein (MBP). This expression vector was named pMAL-MTB2-92.

3. PURIFICATION OF THE ENCODED PROTEIN 

The plasmid pMAL-MTB2-92 was transformed into competent E. coli JM109 cells and a 1 litre culture
was grown up in LB broth at 37°C to an OD550 of 0.5 to 0.6. The expression of the gene was induced by the
addition of IPTG (0.5 mM) to the culture medium, after which the culture was grown for another 3 hours at 37°C
with vigorous shaking. Cultures were spun in the centrifuge at 10,000 g for 30 min and the cell pellet was
harvested. This was re-suspended in 50 ml of 20 mM Tris-HCl, pH 7.2, 200 mM NaCl, 1 mM EDTA
supplemented with 10 mM β-mercaptoethanol and stored at -20°C.

The frozen bacterial suspension was thawed in cold water (0°C), placed in an ice bath, and sonicated. The
resulting cell lysate was then centrifuged at 10,000 g and 4°C for 30 min, the supernatant retained, diluted with
5 volumes of buffer A and applied to an amylose-resin column (New England Bio-Labs Ltd., 3397 American
Drive, Unit 12, Mississauga, Ontario, L4V 1T8, Canada) which had been pre-equilibrated with buffer A. The
column was then washed with buffer A until the eluate reached an absorbance of 0.001, at which point the bound
MBP-MTB2-92 fusion protein was eluted with buffer A containing 10 mM maltose. The protein purified by the
amylose-resin affinity column was about 84 kDa, which corresponded to the expected size of the fusion protein
(MBP: 42 kDa, MTB2-92 plus the histidine tag: 42 kDa).

The eluted MBP-MTB2-92 fusion protein was then cleaved with factor Xa to remove the MBP from the
MTB2-92 protein. One ml of fusion protein (1 mg/ml) was mixed with 100 µl of factor Xa (200 µg/ml) and kept
at room temperature overnight. The mixture was diluted with 10 ml of buffer B (5 mM imidazole, 0.5 M NaCl,
20 mM Tris-HCl, pH 7.9, 6 M urea) and urea was added to the sample to a final concentration of 6 M. The
sample was loaded onto the Ni-NTA column (QIAGEN, 9600 De Soto Ave., Chatsworth, CA 91311, U.S.A.)
pre-equilibrated with buffer B. The column was washed with 10 volumes of buffer B and 6 volumes of buffer C
(60 mM imidazole, 0.5 M NaCl, 20 mM Tris-HCl, pH 7.9, 6 M urea). The bound protein was eluted with
6 volumes of buffer D (1 M imidazole, 0.5 M NaCl, 20 mM Tris-HCl, pH 7.9, 6 M urea).

At each stage of the protein purification, a sample was analysed by SDS polyacrylamide gel electrophoresis
(Laemmli, U.K. (1970) Nature (London), 227:680-685) (see Fig. 2).
C. CORRECTION OF SEQUENCE ERRORS 

It is noted that some of the sequences presented in the Sequence Listing contain sequence ambiguities.
Naturally, in order to ensure that the immunostimulatory function is maintained, one would utilize a sequence
without such ambiguities. For those sequences containing ambiguities, one would therefore utilize the sequence
data provided in the Sequence Listing to design primers corresponding to each terminus of the provided sequence
and, using these primers in conjunction with the polymerase chain reaction, synthesize the desired DNA molecule
using M. tuberculosis genomic DNA as a template. Standard PCR methodologies, such as those described above,
may be used to accomplish this.

VI. EXPRESSION AND PURIFICATION OF THE CLONED PEPTIDES 

Having provided herein DNA sequences encoding Mycobacterium tuberculosis peptides having an
immunostimulatory activity, as well as the corresponding full length Mycobacterium tuberculosis genes, one of skill
in the art will be able to express and purify the peptides encoded by these sequences. Methods for expressing
proteins by recombinant means in compatible prokaryotic or eukaryotic host cells are well known in the art and are
discussed, for example, in reference nos. 15 and 16. Peptides expressed by the nucleotide sequences disclosed
herein are useful for preparing vaccines effective against M. tuberculosis infection, for use in diagnostic assays and
for raising antibodies that specifically recognize M. tuberculosis proteins. One method of purifying the peptides is
that presented in part V(B) above.

The most commonly used prokaryotic host cells for expressing prokaryotic peptides are strains of
Escherichia coli, although other prokaryotes, such as Bacillus subtilis, Streptomyces or Pseudomonas, may also be
used, as is well known in the art. Partial or full-length DNA sequences, encoding an immunostimulatory peptide
according to the present invention, may be ligated into bacterial expression vectors. One aspect of the present
invention is thus a recombinant DNA vector including a nucleic acid molecule provided by the present invention.
Another aspect is a transformed cell containing such a vector.

Methods for expressing large amounts of protein from a cloned gene introduced into Escherichia coli
(E. coli) may be utilized for the purification of the Mycobacterium tuberculosis peptides. Methods and plasmid
vectors for producing fusion proteins and intact native proteins in bacteria are described in reference no. 15
(ch. 17). Such fusion proteins may be made in large amounts, are relatively simple to purify, and can be used to
produce antibodies. Native proteins can be produced in bacteria by placing a strong, regulated promoter and an
efficient ribosome binding site upstream of the cloned gene. If low levels of protein are produced, additional steps
may be taken to increase protein production; if high levels of protein are produced, purification is relatively easy.

Often, proteins expressed at high levels are found in insoluble inclusion bodies. Methods for extracting
proteins from these aggregates are described in ch. 17 of reference no. 15. Vector systems suitable for the
expression of lacZ fusion genes include the pUR series of vectors (24), pEX1-3 (25) and pMR100 (26). Vectors
suitable for the production of intact native proteins include pKC30 (27), pKK177-3 (28) and pET-3 (29). Fusion
proteins may be isolated from protein gels, lyophilized, ground into a powder and used as antigen preparations.

Mammalian or other eukaryotic host cells, such as those of yeast, filamentous fungi, plant, insect,
amphibian or avian species, may also be used for protein expression, as is well known in the art. Examples of
commonly used mammalian host cell lines are VERO and HeLa cells, Chinese hamster ovary (CHO) cells, and
WI38, BHK, and COS cell lines, although it will be appreciated by the skilled practitioner that other prokaryotic
and eukaryotic cells and cell lines may be appropriate for a variety of purposes, e.g., to provide higher expression,
desirable glycosylation patterns, or other features.

VII. SEQUENCE VARIANTS

It will be apparent to one skilled in the art that the immunostimulatory activity of the peptides encoded by
the DNA sequences disclosed herein lies not in the precise nucleotide sequence of the DNA sequences, but rather
in the epitopes inherent in the amino acid sequences encoded by the DNA sequences. It will therefore also be
apparent that it is possible to recreate the immunostimulatory activity of one of these peptides by recreating the
epitope, without necessarily recreating the exact DNA sequence. This could be achieved either by directly
synthesizing the peptide (thereby circumventing the need to use the DNA sequences) or, alternatively, by designing
a nucleic acid sequence that encodes the epitope, but which differs, by reason of the redundancy of the genetic
code, from the sequences disclosed herein.

Accordingly, the degeneracy of the genetic code further widens the scope of the present invention as it
enables major variations in the nucleotide sequence of a DNA molecule while maintaining the amino acid sequence
of the encoded protein. The genetic code and variations in nucleotide codons for particular amino acids are
presented in Tables 5 and 6. Based upon the degeneracy of the genetic code, variant DNA molecules may be
derived from the DNA sequences disclosed herein using standard DNA mutagenesis techniques, or by synthesis of
DNA sequences.
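
The effect of codon degeneracy described above can be illustrated with a short sketch. It is an illustration only, under stated assumptions: the 15-nucleotide input is a hypothetical coding sequence (not one taken from the sequence listing), the function names are invented for this example, and the standard genetic code is assumed. The sketch replaces every codon with a synonymous codon where one exists and confirms that the encoded amino acid sequence is unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Reverse lookup: amino acid (or "*" for stop) -> list of its codons
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string (A/C/G/T only) into one-letter amino acids."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_variant(dna):
    """Swap each codon for the next synonymous codon (Met and Trp stay unchanged)."""
    translate(dna)  # validates length and alphabet
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = AA_TO_CODONS[CODON_TO_AA[codon]]
        out.append(synonyms[(synonyms.index(codon) + 1) % len(synonyms)])
    return "".join(out)


original = "ATGGCTCTGTCCCGT"  # hypothetical sequence encoding Met-Ala-Leu-Ser-Arg
variant = synonymous_variant(original)
print(original, translate(original))
print(variant, translate(variant))
```

The two printed DNA strings differ, while the two translations are identical. Ambiguity codes such as N or R, which occur in the sequence listing, are deliberately not handled by this sketch.
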

TABLE 5
The Genetic Code

First                                                        Third
Position                   Second Position                   Position
(5' end)      T           C           A             G        (3' end)

   T          Phe         Ser         Tyr           Cys         T
              Phe         Ser         Tyr           Cys         C
              Leu         Ser         Stop (och)    Stop        A
              Leu         Ser         Stop (amb)    Trp         G

   C          Leu         Pro         His           Arg         T
              Leu         Pro         His           Arg         C
              Leu         Pro         Gln           Arg         A
              Leu         Pro         Gln           Arg         G

   A          Ile         Thr         Asn           Ser         T
              Ile         Thr         Asn           Ser         C
              Ile         Thr         Lys           Arg         A
              Met         Thr         Lys           Arg         G

   G          Val         Ala         Asp           Gly         T
              Val         Ala         Asp           Gly         C
              Val         Ala         Glu           Gly         A
              Val (Met)   Ala         Glu           Gly         G

"Stop (och)" stands for the ochre termination triplet and
"Stop (amb)" for the amber. ATG is the most common
initiator codon; GTG usually codes for valine, but it can
also code for methionine to initiate an mRNA chain.

TABLE 6
The Degeneracy of the Genetic Code

Number of                                                        Total
Synonymous                                                       Number of
Codons        Amino Acid                                         Codons

6             Leu, Ser, Arg                                      18
4             Gly, Pro, Ala, Val, Thr                            20
3             Ile                                                 3
2             Phe, Tyr, Cys, His, Gln, Glu, Asn, Asp, Lys        18
1             Met, Trp                                            2

Total number of codons for amino acids                           61
Number of codons for termination                                  3
Total number of codons in genetic code                           64
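
The totals in Table 6 can be rederived from the standard genetic code. The following is a minimal sketch, assuming the standard code and using one-letter amino acid symbols; the variable names are illustrative only. It groups the amino acids by their number of synonymous codons and sums the sense and termination codons.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
codon_to_aa = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Count codons per amino acid; "*" marks the termination codons
codons_per_aa = Counter(codon_to_aa.values())
stop_count = codons_per_aa.pop("*")

# Group amino acids by how many synonymous codons each one has
by_degeneracy = defaultdict(list)
for aa, n in codons_per_aa.items():
    by_degeneracy[n].append(aa)

for n in sorted(by_degeneracy, reverse=True):
    members = sorted(by_degeneracy[n])
    print(n, ", ".join(members), n * len(members))

sense_total = sum(codons_per_aa.values())
print("codons for amino acids:", sense_total)
print("codons for termination:", stop_count)
print("total codons:", sense_total + stop_count)
```
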

Additionally, standard mutagenesis techniques may be used to produce peptides which vary in amino acid
sequence from the peptides encoded by the DNA molecules disclosed herein. However, such peptides will retain
the essential characteristic of the peptides encoded by the DNA molecules disclosed herein, i.e., the ability to
stimulate IFN-γ production. This characteristic can readily be determined by the assay technique described above.
Such variant peptides include those with variations in amino acid sequence including minor deletions, additions and
substitutions.

While the site for introducing an amino acid sequence variation is predetermined, the mutation per se need 
not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random 
mutagenesis may be conducted at the target codon or region and the expressed protein variants screened for the 
optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in 
DNA having a known sequence as described above are well known. 

In order to maintain the functional epitope, preferred peptide variants will differ by only a small number of
amino acids from the peptides encoded by the DNA sequences disclosed herein. Preferably, such variants will be
amino acid substitutions of single residues. Substitutional variants are those in which at least one residue in the
amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally
are made in accordance with the following Table 7 when it is desired to finely modulate the characteristics of the
protein. Table 7 shows amino acids which may be substituted for an original amino acid in a protein and which
are regarded as conservative substitutions. As noted, all such peptide variants are tested to confirm that they retain
the ability to stimulate IFN-γ production.

TABLE 7

Original Residue      Conservative Substitutions

Ala                   ser
Arg                   lys
Asn                   gln; his
Asp                   glu
Cys                   ser
Gln                   asn
Glu                   asp
Gly                   pro
His                   asn; gln
Ile                   leu; val
Leu                   ile; val
Lys                   arg; gln; glu
Met                   leu; ile
Phe                   met; leu; tyr
Ser                   thr
Thr                   ser
Trp                   tyr
Tyr                   trp; phe
Val                   ile; leu
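
The use of Table 7 to design single-residue variants can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the substitution map transcribes Table 7 into one-letter amino acid codes, the four-residue input peptide is hypothetical, and the function name is invented for this example. Each candidate produced this way would still have to be tested for retention of IFN-γ stimulating activity.

```python
# Table 7 transcribed into one-letter codes: original residue -> conservative substitutes.
# Proline has no entry in Table 7, so it is left unchanged here.
CONSERVATIVE = {
    "A": "S", "R": "K", "N": "QH", "D": "E", "C": "S", "Q": "N", "E": "D",
    "G": "P", "H": "NQ", "I": "LV", "L": "IV", "K": "RQE", "M": "LI",
    "F": "MLY", "S": "T", "T": "S", "W": "Y", "Y": "WF", "V": "IL",
}


def single_substitution_variants(peptide):
    """Return every peptide that differs from the input by one conservative substitution."""
    variants = []
    for pos, residue in enumerate(peptide):
        for replacement in CONSERVATIVE.get(residue, ""):
            variants.append(peptide[:pos] + replacement + peptide[pos + 1:])
    return variants


example = "MKTP"  # hypothetical peptide: Met-Lys-Thr-Pro
variants = single_substitution_variants(example)
for v in variants:
    print(v)
```
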

Substantial changes in immunological identity are made by selecting substitutions that are less conservative
than those in Table 7, i.e., selecting residues that differ more significantly in their effect on maintaining (a) the
structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical
conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in protein properties will be those
in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g.,
leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue;
(c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine,
is substituted for (or by) one not having a side chain, e.g., glycine. However, such variants must retain the ability
to stimulate IFN-γ production.

VIII. USE OF CLONED MYCOBACTERIUM SEQUENCES TO PRODUCE VACCINES

The purified peptides encoded by the nucleotide sequences of the present invention may be used directly as
immunogens for vaccination. The conventional tuberculosis vaccine is the BCG (bacille Calmette-Guerin) vaccine,
which is a live vaccine comprising attenuated Mycobacterium bovis bacteria. However, the use of this vaccine in a
number of countries, including the U.S., has been limited because administration of the vaccine interferes with the
use of the tuberculin skin test to detect infected individuals (see Cecil Textbook of Medicine (Ref. 33), pages
1733-1742 and section IX(B) below).

The present invention provides a possible solution to the problems inherent in the use of the BCG vaccine
in conjunction with the tuberculin skin test. The solution proposed is based upon the use of one or more of the
immunostimulatory M. tuberculosis peptides disclosed herein as a vaccine and one or more different
immunostimulatory M. tuberculosis peptides disclosed herein in the tuberculin skin test (see section IX(B)
below). If the immune system is primed with such a vaccine, it will be able to resist an infection by M.
tuberculosis. However, exposure to the vaccine peptides alone will not induce an immune response to those
peptides that are reserved for use in the tuberculin skin test. Thus, the present invention would allow the clinician
to distinguish between a vaccinated individual and an infected individual.

Methods for using purified peptides as vaccines are well known in the art and are described in the
following publications: Pal and Horwitz (1992) (reference no. 8) (describing immunization with extra-cellular
proteins of Mycobacterium tuberculosis); Yang et al. (1991) (reference no. 30) (vaccination with synthetic peptides
corresponding to the amino acid sequence of a surface glycoprotein from Leishmania major); Andersen (1994)
(reference no. 9) (vaccination using short-term culture filtrate containing proteins secreted by Mycobacterium
tuberculosis); and Jardim et al. (1990) (reference no. 10) (vaccination with synthetic T-cell epitopes derived from
Leishmania parasite). Methods for preparing vaccines which contain immunogenic peptide sequences are also
disclosed in U.S. Patent Nos. 4,608,251, 4,601,903, 4,599,231, 4,599,230, 4,596,792 and 4,578,770. The
formulation of peptide-based vaccines employing M. tuberculosis peptides is also discussed extensively in
International Patent Application WO 95/01441.

As is well known in the art, adjuvants such as Complete Freund's Adjuvant (CFA) and Incomplete
Freund's Adjuvant (IFA) may be used in formulations of purified peptides as vaccines. Accordingly, one
embodiment of the present invention is a vaccine comprising one or more immunostimulatory M. tuberculosis
peptides encoded by genes including a sequence shown in the attached sequence listing, together with a
pharmaceutically acceptable adjuvant.

Additionally, the vaccines may be formulated using a peptide according to the present invention together 
with a pharmaceutically acceptable excipient such as water, saline, dextrose and glycerol. The vaccines may also 
include auxiliary substances such as emulsifying agents and pH buffers. 

It will be appreciated by one of skill in the art that vaccines formulated as described above may be
administered in a number of ways including subcutaneous, intra-muscular and intra-venous injection. Doses of the
vaccine administered will vary depending on the antigenicity of the particular peptide or peptide
combination employed in the vaccine, and characteristics of the animal or human patient to be vaccinated. While
the determination of individual doses will be within the skill of the administering physician, it is anticipated that
doses of between 1 microgram and 1 milligram will be employed.

As with many vaccines, the vaccines of the present invention may routinely be administered several times
over the course of a number of weeks to ensure that an effective immune response is triggered. As described in
International Patent Application WO 95/01441, up to six doses of the vaccine may be administered over a course
of several weeks, but more typically between one and four doses are administered. Where such multiple doses are
administered, they will normally be administered at from two to twelve week intervals, more usually from three to
five week intervals. Periodic boosters at intervals of 1-5 years, usually three years, will be desirable to maintain
the desired levels of protective immunity.

As described in WO 95/01441, the course of the immunization may be followed by in vitro proliferation
assays of PBL (peripheral blood lymphocytes) co-cultured with ESAT6 or ST-CF, and especially by measuring the
levels of IFN-γ released from the primed lymphocytes. The assays are well known and are widely described in the
literature, including in U.S. Patent Nos. 3,791,932; 4,174,384 and 3,949,064.

To ensure an effective immune response against tuberculosis infection, vaccines according to the present
invention may be formulated with more than one immunostimulatory peptide encoded by the nucleotide sequences
disclosed herein. In such cases, the amount of each purified peptide incorporated into the vaccine will be adjusted
accordingly.

Alternatively, multiple immunostimulatory peptides may also be administered by expressing the nucleic
acids encoding the peptides in a nonpathogenic microorganism, and using this transformed nonpathogenic
microorganism as a vaccine. As described in International Patent Application WO 95/01441, Mycobacterium
bovis BCG may be employed for this purpose, although this approach would destroy the advantage outlined above
to be gained from using separate classes of the peptides as vaccines and in the skin test. As disclosed in
WO 95/01441, an immunostimulatory peptide of M. tuberculosis can be expressed in the BCG bacterium by
transforming the BCG bacterium with a nucleotide sequence encoding the M. tuberculosis peptide. Thereafter, the
BCG bacteria can be administered in the same manner as a conventional BCG vaccine. In particular embodiments
multiple copies of the M. tuberculosis sequence are transformed into the BCG bacteria to enhance the amount of M.
tuberculosis peptide produced in the vaccine strain.

IX. USE OF CLONED MYCOBACTERIUM SEQUENCES IN DIAGNOSTIC ASSAYS 

Another aspect of the present invention is a composition for diagnosing tuberculosis infection wherein the
composition includes peptides encoded by the nucleotide sequences of the present invention. The invention also
encompasses methods and compositions for detecting the presence of anti-tuberculosis antibodies, tuberculosis
peptides and tuberculosis nucleic acid sequences in body samples. Three examples typify the various techniques
that may be used to diagnose tuberculosis infection using the present invention: an in vitro ELISA assay, an in
vivo skin test assay and a nucleic acid amplification assay.
A. IN VITRO ELISA ASSAY 

One aspect of the invention is an ELISA that detects anti-tuberculosis mycobacterial antibodies in a medical
specimen. An immunostimulatory peptide encoded by a nucleotide sequence of the present invention is employed
as an antigen and is preferably bound to a solid matrix such as a crosslinked dextran such as SEPHADEX
(Pharmacia, Piscataway, NJ), agarose, polystyrene, or the wells of a microtiter plate. The polypeptide is admixed
with the specimen, such as human sputum, and the admixture is incubated for a sufficient time to allow
antimycobacterial antibodies present in the sample to immunoreact with the polypeptide. The presence of the
immunopositive immunoreaction is then determined using an ELISA assay.

In a preferred embodiment, the solid support to which the polypeptide is attached is the wall of a microtiter
assay plate. After attachment of the polypeptide, any nonspecific binding sites on the microtiter well walls are
blocked with a protein such as bovine serum albumin. Excess bovine serum albumin is removed by rinsing and the
medical specimen is admixed with the polypeptide in the microtiter wells. After a sufficient incubation time, the
microtiter wells are rinsed to remove excess sample and then a solution of a second antibody, capable of detecting
human antibodies, is added to the wells. This second antibody is typically linked to an enzyme such as peroxidase,
alkaline phosphatase or glucose oxidase. For example, the second antibody may be a peroxidase-labeled goat
anti-human antibody. After further incubation, excess amounts of the second antibody are removed by rinsing and a
solution containing a substrate for the enzyme label (such as hydrogen peroxide for the peroxidase enzyme) and a
color-forming dye precursor, such as o-phenylenediamine, is added. The combination of mycobacterium peptide
(bound to the wall of the well), the human antimycobacterial antibodies (from the specimen), the
enzyme-conjugated anti-human antibody and the color substrate will produce a color that can be read using an instrument
that determines optical density, such as a spectrophotometer. These readings can be compared to a control
incubated with water in place of the human body sample, or, preferably, a human body sample known to be free of
antimycobacterial antibodies. Positive readings indicate the presence of anti-mycobacterial antibodies in the
specimen, which in turn indicates a prior exposure of the patient to tuberculosis.
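
The comparison of optical density readings against negative controls can be sketched as follows. This is an illustrative sketch only: the specification does not state a numerical threshold, so the cutoff used here (mean of the negative controls plus three standard deviations) is an assumed convention, and the readings, sample names and function names are hypothetical.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_ods, n_sd=3.0):
    """Assumed cutoff: mean of negative-control ODs plus n_sd sample standard deviations."""
    return mean(negative_ods) + n_sd * stdev(negative_ods)


def classify(sample_ods, negative_ods, n_sd=3.0):
    """Map each sample name to True (reading above cutoff) or False."""
    cutoff = elisa_cutoff(negative_ods, n_sd)
    return {name: od > cutoff for name, od in sample_ods.items()}


# Hypothetical optical density readings
negative_controls = [0.08, 0.10, 0.12]
samples = {"specimen_1": 0.85, "specimen_2": 0.11}

cutoff = elisa_cutoff(negative_controls)
calls = classify(samples, negative_controls)
print("cutoff:", round(cutoff, 3))
for name, positive in calls.items():
    print(name, "positive" if positive else "negative")
```

The multiplier `n_sd` is exposed as a parameter because the appropriate stringency depends on the assay and would have to be established experimentally.
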

B. SKIN TEST ASSAY

Alternatively, the presence of tuberculosis antibodies in a patient's body may be detected using an
improved form of the tuberculin skin test, employing immunostimulatory peptides of the present invention.
Conventionally, this test produces a positive result under any one of the following conditions: the current presence of M.
tuberculosis in the patient's body; past exposure of the patient to M. tuberculosis; or prior BCG vaccination. As
noted above, if one group of immunostimulatory peptides is reserved for use in vaccine preparations, and another
group reserved for use in the improved skin test, then the skin test will not produce a positive response in
individuals whose only exposure to tuberculosis antigens was via the vaccine. Accordingly, the improved skin test
would be able to properly distinguish between infected individuals and vaccinated individuals.

The tuberculin skin test consists of an intradermal injection of proteins from M. tuberculosis.
The test is described in detail in Cecil Textbook of Medicine (Ref. 33), pages 1733-1742. If the
subject has reactive T-cells to the injected protein, the cells will migrate to the site of injection and cause a local
inflammation. This inflammation, which is generally known as delayed type hypersensitivity (DTH), is indicative
of M. tuberculosis antibodies in the patient's blood stream. Purified immunostimulatory peptides according to the
present invention may be employed in the tuberculin skin test using the methods described in reference 33.

C. NUCLEIC ACID AMPLIFICATION 

One aspect of the invention includes nucleic acid primers and probes derived from the sequences set forth 
in the attached sequence listing, as well as primers and probes derived from the full length genes that can be 
obtained using these sequences. These primers and probes can be used to detect the presence of M. tuberculosis 
nucleic acids in body samples and thus to diagnose infection. Methods for making primers and probes based on 
these sequences are well known and are described in section V above. 

The detection of specific pathogen nucleic acid sequences in human body samples by polymerase chain
reaction amplification (PCR) is discussed in detail in reference 17, in particular, part four of that reference. To
detect M. tuberculosis sequences, primers based on the sequences disclosed herein would be synthesized, such that
PCR amplification of a sample containing M. tuberculosis DNA would result in an amplified fragment of a
predicted size. If necessary, the presence of this fragment following amplification of the sample nucleic acid could
be detected by dot blot analysis (see chapter 48 of reference 17). PCR amplification employing primers based on
the sequences disclosed herein may also be employed to quantify the amounts of M. tuberculosis nucleic acid
present in a particular sample (see chapters 8 and 9 of reference 17). Reverse-transcription PCR using these
primers may also be utilized to detect the presence of M. tuberculosis RNA, indicative of an active infection.
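
How a predicted fragment size follows from a chosen primer pair can be sketched in a few lines. This is an illustration under stated assumptions: the template is nucleotides 151-250 of SEQ ID NO: 1 as given in the sequence listing (a stretch free of ambiguity codes), the two 20-mer primers are simply taken from its ends and are hypothetical (they have not been evaluated for melting temperature or specificity), and the function names are invented for this example. Only exact matches are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string containing only A, C, G and T."""
    return seq.translate(COMPLEMENT)[::-1]


def predicted_amplicon_size(template, forward, reverse):
    """Length of the fragment bounded by two exactly matching primers, or None if either fails to match."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand's complement, so search for its reverse complement
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Nucleotides 151-250 of SEQ ID NO: 1
TEMPLATE = (
    "CAAGAACTCCGGCACCCAGAATGCGTGTGTCACATCGGCTGAGGCCATTT"
    "GGAATTCGATACGCTTGCCGGACGGCAGCACCAGCACCGGAATTTCGGTG"
)

forward_primer = TEMPLATE[:20]                      # hypothetical forward primer
reverse_primer = reverse_complement(TEMPLATE[-20:])  # hypothetical reverse primer

size = predicted_amplicon_size(TEMPLATE, forward_primer, reverse_primer)
print(forward_primer, reverse_primer, size)
```

A practical primer design would also need to tolerate the ambiguity codes (N, R, K, S, Y) present elsewhere in the listed sequences, which this exact-match sketch does not attempt.
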

Alternatively, probes based on the nucleic acid sequences described herein may be labelled with suitable
labels (such as 32P or biotin) and used in hybridization assays to detect the presence of M. tuberculosis nucleic acid
in provided samples.

X. USE OF CLONED MYCOBACTERIUM SEQUENCES TO RAISE ANTIBODIES

Monoclonal antibodies may be produced to the purified M. tuberculosis peptides for diagnostic purposes. 
Substantially pure M. tuberculosis peptide suitable for use as an immunogen is isolated from the transfected or 
transformed cells as described above. The concentration of protein in the final preparation is adjusted, for 
example, by concentration on an Amicon filter device, to the level of a few milligrams per milliliter. Monoclonal 
antibody to the protein can then be prepared as follows: 

A. MONOCLONAL ANTIBODY PRODUCTION BY HYBRIDOMA FUSION.

Monoclonal antibody to epitopes of the M. tuberculosis peptides identified and isolated as described can be
prepared from murine hybridomas according to the classical method of Kohler and Milstein (1975) or derivative
methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected purified protein
over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen
isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess
unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The
successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of
the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid
of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (1980), and derivative
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for
use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (1988).

B. ANTIBODIES RAISED AGAINST SYNTHETIC PEPTIDES. 

An alternative approach to raising antibodies against the M. tuberculosis peptides is to use synthetic
peptides synthesized on a commercially available peptide synthesizer based upon the amino acid sequence of the
peptides predicted from nucleotide sequence data.

In a preferred embodiment of the present invention, monoclonal antibodies that recognize a specific M.
tuberculosis peptide are produced. Optimally, monoclonal antibodies will be specific to each peptide, i.e., such
antibodies recognize and bind one M. tuberculosis peptide and do not substantially recognize or bind to other
proteins, including those found in healthy human cells.

The determination that an antibody specifically detects a particular M. tuberculosis peptide is made by any
one of a number of standard immunoassay methods; for instance, the Western blotting technique (Sambrook et al.,
1989). To determine that a given antibody preparation (such as one produced in a mouse) specifically detects one
M. tuberculosis peptide by Western blotting, total cellular protein is extracted from a sample of human sputum
from a healthy patient and from sputum from a patient suffering from tuberculosis. As a positive control, total
cellular protein is also extracted from M. tuberculosis cells grown in vitro. These protein preparations are then
electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. Thereafter, the proteins are transferred to a
membrane (for example, nitrocellulose) by Western blotting, and the antibody preparation is incubated with the
membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically
bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline
phosphatase; application of the substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the
production of a dense blue compound by immuno-localized alkaline phosphatase. Antibodies which specifically
detect the M. tuberculosis protein will, by this technique, be shown to bind to the M. tuberculosis-extracted sample
at a particular protein band (which will be localized at a given position on the gel determined by its molecular
weight) and to the proteins extracted from the sputum from the tuberculosis patient. No significant binding will be
detected to proteins from the healthy patient sputum. Non-specific binding of the antibody to other proteins may
occur and may be detectable as a weak signal on the Western blot. The non-specific nature of this binding will be
recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary
signal arising from the specific antibody-tuberculosis protein binding. Preferably, no antibody would be found to
bind to proteins extracted from healthy donor sputum.

Antibodies that specifically recognize a M. tuberculosis peptide encoded by the nucleotide sequences 
disclosed herein are useful in diagnosing the presence of tuberculosis antigens in patients. 

All publications and published patent documents cited in this specification are incorporated herein by
reference to the same extent as if each individual publication or patent application was specifically and individually
indicated to be incorporated by reference.

SEQUENCE LISTING

(1) GENERAL INFORMATION
(i) APPLICANTS: UNIVERSITY OF VICTORIA
(ii) TITLE OF INVENTION: MYCOBACTERIUM TUBERCULOSIS DNA
SEQUENCES ENCODING IMMUNOSTIMULATORY PEPTIDES
(iii) NUMBER OF SEQUENCES: 76
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Klarquist Sparkman Campbell Leigh
& Whinston, LLP
(B) STREET: One World Trade Center, Suite 1600,
121 S.W. Salmon Street
(C) CITY: Portland
(D) STATE: OR
(E) COUNTRY: USA
(F) ZIP: 97204-2988
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Disk, 3.5-inch
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: MS DOS
(D) SOFTWARE: WordPerfect 5.1+
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/000,254
(B) FILING DATE: 06/15/95
(viii) ATTORNEY/AGENT INFORMATION
(A) NAME: Richard J. Polley
(B) REGISTRATION NUMBER: 28,107
(C) REFERENCE/DOCKET NUMBER: 2847-45176/RJP
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (503) 226-7391
(B) TELEFAX: (503) 228-9446

(2) INFORMATION FOR SEQ ID NO: 1
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#1-62
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1

ACGCGGACCT CGAAGTTCAT CATCGAGTGA TACGTGCCAC ACATCTCGGC 50
GCAGTGGCCC ACGAATGCAN CCGGTCTTGG TGATTTCNTC GATCTGGAAG 100
ACGTTGACCG ARTTGTTTGC CACCGGGTTA GGCATCACGT CACGCTTGAA 150
CAAGAACTCC GGCACCCAGA ATGCGTGTGT CACATCGGCT GAGGCCATTT 200
GGAATTCGAT ACGCTTGCCG GACGGCAGCA CCAGCACCGG AATTTCGGTG 250
CTGTGCAACG TCTCG 265

(2) INFORMATION FOR SEQ ID NO: 2
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#1-152
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2

CTGGTACGAC GCCGGCAAGG ACTACGGACG AGGTGGCACA GAATTCAATG 50
CGGCGCTCAT CGGAACCGAC GTGCCCGACG NCGTTTGCTC GACGACGATG 100
GTGNTTCCAN TTCGCCTNAN CGGTGTNCTG ACTGCCNTTG ACGACCTGNT 150
CGGCCARGTT GGGNTGGACA CAACGGATTA CGTCGATTCG CTGCTGGCCG 200
ACTATGAGTT CAACGGCCGC CATTACGCTG TGCCGTATGC TCGCTCGACG 250
CCGCTGTTCT ACTACAACAA GGCGGCGTGG CAACAGGCCG GCCTACCCGA 300
CCGCGGACCG CAATCCTGGT CAGAGTTCGA CGAGTGGGGT CCGGAGTTAC 350
AGCGCGTGGT CGNCGCCGGT CGATCGGCGC ACGGCTGCGT AACGCCGACC 400
TCATCTCGTG GACGTTTCAG GGACCGAACT GGGCATNCGG CGGTGCCTAC 450
TCCGACAAGT GGACATTGAC ATTGACCGAG CCCG 484

(2) INFORMATION FOR SEQ ID NO: 3
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 513
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#1-239
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3

GGCGGCCAGA CGTCGGAACT CGCGGCCAAT TGGTGTGGTG GGAACCGCGA 50
TCCTCGACGC AACCGCTTCG CGGTCTTGGC AGTGTTCGAT GCCAATCTGC 100
CGGCCGGGAC GCTGCCGGAT GCGGCCCGTT CACCGAGGCT GGTGACAAGA 150
CCTGGCGTTG TCGTTCCGGG CACTACTCCC NAGGTCGGTC AAGGCACCGT 200
CAAAGTGTTC AGGTATACCG TCGAGATCGA GAACGGTCTT GATCCCACAA 250
TGTACGGCGG TGACAANNNN ATTCGCCCAG ATGGTCGACC AGACGTTGAC 300
CAATCCCAAG GGCTGGACCC ACAATCCGCA ATTCGGCGTT CGTGCGGATC 350
GACAGCGGAA AACCCGACTT CCGGATTTCG CTGGTGTCGC CGACGACAGT 400
GCGCGGGGGN TGTGGCTACG AATTCCGGCT CGAGACGTCC TGCTACAACC 450
CGTCGTTCGG CGGCATGGAT CGCCAATCGC GGGTGTTCAT CAACGAGGCG 500
CGCTGGGTAC GCG 513

(2) INFORMATION FOR SEQ ID NO: 4
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 510
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#1-247
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4

GTGTGCAACC AGTGTGTGTN CGTGTGCGAA CCAGTGTGTA GTGGTAACCA 50
GGACCACGTT GCAAACCAGT GTTGGAGTGC AGTGTTGCGT GCNAGTGTTG 100
CNCGTTGCAG TGTTNGNCGA GCCGAGATTG GAAGTTNCCG ACATTACCGT 150
TGCCGACGTT GCCCTCGCCG ACGTTCGCCA AGCCCAGGTT GCGGACACGC 200
CGGTGATTGT GCGTGGGGCA ATGACGGGCT GCTGGCCCGG CCGAATTCCA 250
AGGCGTCGAT CGGCACGGTG TTCCAGGACC GGGCCGCTCG CTACGGTGAC 300
CGAGTCTTCC TGAAATTCGG CGATCAGCAG CTGACCTACC GCGACCGTAA 350
CGCCACCGCC AACCGGTNNG CCGCGGTGTT GGCCNNNCGC GGCGTCGGCC 400
CCGGCGACGT CGTTGGCATC ATGTTGCGTA ACTCACCCAG CACAGTCTTG 450
GCGATGCTGG CCACGGTCAA GTGCGGCGTA TCGCCGGCAT GCTCAACTAC 500
CACCAGCGCG 510

(2) INFORMATION FOR SEQ ID NO: 5
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 456
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#1-426
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5

GCAACGGAGA GGTGGACTAT GCCGGACCGG CACCGCGAAG GGGTTGGTGC 50
CGGCCCGGGT GGTGACGGTG CACATTCTGC GCAATTCGCT GAGTTCCGGT 100
GGTGACCTTC CTGGGCGCGG AGTCTGGGCG CGCTGATGGC GGAGCGAKTG 150
TGACCGAAGG AANTCNGTTC AACATCCACG GCGTCGGGGG CGTGCTGTAT 200
CAAGCGGTCA CCGTCAGGAG ACGCCGACGG TGGTGTCGAT CGTGACGGTG 250
CTGGTGCTGA TCTACCTGAT CACCAATCTG TTGGTGGATC TGCTGTATGC 300
GGCCCTGGAC GCCGNNGATN CGCTATGGCT GAGCACACGG GGTTCTGGCT 350
CGATGCCTNG CGCGGGTTGC GCCGGCGTCC TAAANTCGTG ATCGCGCGGC 400
GCTGAKCCTG CTGATTCTTG TCGTGGCGGC GTTTCCGTCG TTGTTTACCG 450
CAGCCG 456

(2) INFORMATION FOR SEQ ID NO: 6 
35 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-2 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 6 

TCNCTTANYC CTTCANCTGN CATCTNTCCC AANNACCGAA NTCTGGACCT 50
ATSACGNCCA NCTNAANATG NCCCNCGACN AAGGNCNTTG NACGTTCNCT 100 
GKACCACCAN CGGGTTGCAT SCCAAGCTAG NCGAACATCA NASGTTNCGC 150 
GCNTACGAGC CGACCCGCCG CGGCG 175 
(2) INFORMATION FOR SEQ ID NO: 7 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 231
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-23 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 7 

CTTCTCGCGC CAGCCGTCCC GCTGTCCGGG ATGCGCTACC GGTCGTCAGC 50
GCCAAGACGG TGCAGCTCAA CGACGGCGGG TTGGTGCGCA CGGTGCACTT 100
GCCGGCCCCC AATGTSGCGG GGCTGCTGAG TGCGGCCGCG TGCCGCTGTT 150
GCAAANNGCG ACCACGTGGT GCCCGCCGCG ACGGCCCCGA TCGTCGAAGG 200
CATGCAGATC CAGGTGACCC GCAAATCGGA T 231 
(2) INFORMATION FOR SEQ ID NO: 8 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 173 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-26
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 8 

GTTCGNCGCG CTCAAAAGGT TGACGATGGT CACGTCGCAC GTGCTGGCCG 50 
AGACCAAGGT GGATTTCGGT GAAGACCTCA AAGANCTCTA CTCGNATCGT 100 
CAAGGCCCTC AACGACGACC GAAAGGATTT CGTCACCTCG CTGCAGCTGT 150
TGCTGACGTT CCCATTTCCC AAC 173

(2) INFORMATION FOR SEQ ID NO: 9

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 223 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-35 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 9 

CCTGTTNCAA CGGTNCNTTC NCGGAACGGA CGACTTCTGA TNCGNNCTCG 50
GNCGTTCCCT CGCACCGGTC GATGGTGATC AAGGTCAGCG TCTTCGCGGT 100 
GGTCATGCTG CTGGTGGCCG CCGGTCTGGT GGTGGTATTC GGGGACTTCC 150 
GGTTTGGTCC CACAACCGTC TACCACGCCA CCTTCACCGA CNCGTNGCGG 200 
CTGAANGCAG GCCAGAAGGT TCG 223

(2) INFORMATION FOR SEQ ID NO: 10
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 120

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-272 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 10

CAACGAGATC GCACCCGTGA TTAGGAGGTG ACGGTGGCAG CGCCGACCCC 50 
GTCGAATCGG ATCGAAGTAA CGCTCCGTAG ACGCCAGCTC GTCCGCGCCG 100 
ATGCCGACCT GCCACCCGTG 120
(2) INFORMATION FOR SEQ ID NO: 11

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-506 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 11 

CNGGCNNCCA NCGGGTGCGC CAWGCACGGC CGGTCCGTGC GAGATCGTCN 50 
CNAATGGCAN GCCGGCGCCC AAKANANNNC CGGTACCGTG CCTTCGTNGW 100 
GCAWCCTNGC GACCAACCCC GAGATYGCYA CNCTACNGCC GGKACATGAC 150
CGTGGTGCGG 160 
(2) INFORMATION FOR SEQ ID NO: 12 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 133 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-508 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 12 

GACTGGNCCC GAYGYTGTGN CCGGHNCGTH GGNCGHGCHG CANTCGAYCC 50
TGGCCGTTGC TTCGGTGCCG GGTTGTTCAT CGCCTTCGAC CAGTTGTGGC 100
GCTGGAACAG CATAGTGGCG CTAGTGCTAT CGG 133 
(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-511
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 13 

GCGNACNCTG CGCATNGCTG CCNGTANCCC GGCGCCNAGG CATGAGNCNN 50 
TAGGCCGAAA TGCCTGGTKA ANCTNGCGTG TSGTGGTTGA CCCGCNGCGT 100 
SCNGGCNTAC AKGTGCATGC TGTNGATCGG CAGTGGGAGA GGTGAGCGGT 150 
GCGGCGTNAA GGTGCGGAGG TTNGASNTCT GGCGGTGTCG GCGTTNGGTG 200
GCTTTGTTCC CGGCGGTCGC GGGGTGCTCC NGNATTCCGG CGACNAACNA 250
AANNCCGGGN AGSACGAYNC CCGTCGACAC CNGGCAAACG CTGAGGGCCG 300
GCACGGACCC TTCTTCCCGC AATGTGGCGG CGTCAGCGAT CANGACGGTG 350
ACCGAGCTGW ACAAGGGTGA CCGGGCTGGT CAACACCGCC AAGAAGTCGG 400
TGGGCTNCCA ATGGCNTGGC G 421

(2) INFORMATION FOR SEQ ID NO: 14

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-523

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 14 

CCAGNCCNCC NAACNTGTYN CGNTCTCAYY TCGCCGTCGC TGCCGGTNCG 50 
TGTGTGCACC ATCTGCACCG ACCCGTGKAA CYTCGATCAC GANACTGGNA 100 
GAGNTCAGGC ATNAAAGCCG GAGTGGCACA GCAACGGTCG CTACTGGAAT 150
TGGCGAAGCT GGATGCTGAG CTGAC 175
(2) INFORMATION FOR SEQ ID NO: 15 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-639
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 15 

GGGCTGGATT CGAGGCTCGT GCATGNCGTA CGACTANGGG TAGCGCCCAG 50
CTGCTCAATA CCATCGGTTG GATAACAAAG GCTGAACATG AATGGCNTGA 100 
TCTCNACAAG CGTGCGGCTC CCACCGACCC CGGCGCCCCT CGAGCCTGGG 150
GSTGTCGCGA TCCTGATCGC GGCGACACTT TTCGCGACTG TCGTTGCGGG 200
GTGCGGGAAA AAACCGACCA CGGCGAGCTC CCGAGTCCCG GGTCGCCGTC 250
GCCGGAAGCC CAC 263
(2) INFORMATION FOR SEQ ID NO: 16 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-822 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 16 

YGCCATGCGA AGCGCACCCC GGTCCGGAAG NCCTGCACAG TTCWNCCGTG 50 
CTCGCCGCGA CGCTACTCCT CGNYTGCGGC GGTCCCAYGC AGCCAYGCAG 100 
CATCACCTTG ACCTTTATCC GCAACGYGYA ATYCCAGGCC AAYGCCGAYG 150
GGATCATCGA YACCKACA 168 
(2) INFORMATION FOR SEQ ID NO: 17 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-854 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 17 
ACCNGTTCCC GCCGGNCTNA CNCNCGGTGC CGTTGCACCG GCCANCTGCA 50
GCCTGCCCCG ACGCCGAAGT GGTGTTCGCN CCGCGGCCGC TTCGAACCGC 100
CCGGGATTGG CACGGTCGGC AABGCATTCG TCAGCNNTGC GCTCGAAGGT 150
CAACAAGAAT GTCGGGGTCT ACGCGGTGAA A 181
(2) INFORMATION FOR SEQ ID NO: 18

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-872

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 18 

AGGTKACGGT GGCAGCGCCG ACCCCGTCGA ATCGGWTCGA AGAAYGCTCC 50 
GKACACGCCA GCTGCGTCCG YGCCGATGCC GACCTGCCAC CCGTG 95 
(2) INFORMATION FOR SEQ ID NO: 19
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-884d
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 19

AKCGGTCACC KACGGGCCGG CCACCGATGC GATTGTCAAC GGATTCCAAG 50
TGGTTGYGCA TGCGC 65 

(2) INFORMATION FOR SEQ ID NO: 20 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 156

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-884l

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 20

TCTTCTACAA GGACGCCTTC GCCAAGCACC AGGAGCTGTT CGACGACTTG 50 
GNCGTCAACG TCAACAATGG CTTGTCCGAT CTGTACRAGC AAGWTCGAGT 100 
CGCTGCCGNB CGCAACGCGA CGAGATCATC GAGGACCTAC ACCGTTGCCA 150 
CGAACA 156
(2) INFORMATION FOR SEQ ID NO: 21

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-894l
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 21 

ATNCCGTTCC ACTNCCGCGG CAGCAGCTGG NTTTGCGCAC ACGGTGACCC 50 
AGTGGCGNTT GGTGGGGCCT CGCTGACGGC GAGTNTGGNC GAGCGTCCTC 100 
GGTCGGTGNC CTNTCNTCCC GCC 123
(2) INFORMATION FOR SEQ ID NO: 22 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 636

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-898 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 22 

UGCj I CWHKCA ANTTGATGBC NGCGCGCAAG GCCGNCATGG TNGAGATGCC 50
AACCACACCA CCGGCTGGNT CCGCATGGAC TTCGTGNTTS CCAGTCGCNG 100
LLl OA I TGGG TGNCGCACCG ACNNCCTNCA CCGAGACCSG TGGCTC-SGA 150
AW C T CG AC ATCAATKCAN CCGGAGNAGN ANGCTGACCN AACATNCGCT 200
CGACCGC GGATGTCNAT CGAGNACGST GCCAAGSCGC TGCAGCTGGA 250
INCTCGAGCG CGCCATGGAG TNATRTGCGS CCGACGAATN QGTCGAGGTG 300
ACCCCGGAGA NTCGTGCGGA TSCGCRAAGT CGAGCTGGCC GGCCNGCCGC 350
CCGGGCTNMG CAGCCGGGCG CGCACCNAAG GCGCGTGGCN TAGCANACTT 400
GGCGNGCTGG CCGCGCGAGC GTANACNGCC ACTGCGAAAN TCCANGCCCG 450
GCTTTTCGCA GCCGGGTTNA CGCTCGTGGG GGTACTGGAT AGCCTGATGG 500
GCGTGCCCAG NCCCANGTCC GCCGCGTCTG TGTGACGGTC GGCGCGTTGG 550
TCGCGCTGGC GTGTATGGTG TTGGCCGGGT GCACGGTCAG CCCGCCGCCG 600
GCACCCCAGA GCASTGATAC GCCGCGCAGC ACACCG 636

(2) INFORMATION FOR SEQ ID NO: 23 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 103 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-916
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 23

CTTCCGGCGG GACAACAACA GGTCTCACCG GCGCCACACC CTGACACCTG 50 
ATCGCGTCTG CCGATCCCGG TCGGAGCACC CGGGTTCCAC CGCTGTGCCC 100
CCC 103

(2) INFORMATION FOR SEQ ID NO: 24 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-1014
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 24 

GCCACCGGTT CATCGCGTGG TGCTGGTCAC CGCCNGGAAN GCCTCAGCGG 50 
ATCCCCTGCT GCCACCGCCG CCTATCCCTG CCCCAGTCTC GGCGCCGGCA 100 
ACAGTCCCGY CCGTGCAGAA CCTCACGGCT NCTHCCGGGC GGGAGCAGCA 150
ACAGGTTCTC ACCGGYGCCW NGYACCCGCA CCGATCGCGT CGCCGATTCC 200
GGTCGGA 207
(2) INFORMATION FOR SEQ ID NO: 25

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 204

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1025
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 25 

TTNCGCANNC GTTCATCCAG GTCCACTGGT GTCGCANCTC TCNNTGATGC 50
ACCGGTTCCG GATATATGTC NACATCNCCS TCSTCGTCCT GGTGCTGGTA 100
CTNACGAACC TGATCGCGCA TTTCACCACA CCGTGNGCGA GCATCGCCAC 150
CGTCCCGGCC GCCYGCGGTC GGACTGGTGA TCTTGGTKCG GAGTAGAGGC 200
CTGG 204

(2) INFORMATION FOR SEQ ID NO: 26

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-1035

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 26 

ATACCNGTCA TCCNGCACAT NGTCAACCTN GAGTCGGTNC TCACCTACGA 50
GGCACGCCCG AGATGCATCA CTGGTGCTCG RTCAGNCCTT CACGGCTTGG 100
CCGCCTTCCG GTAGGACCGT HGCATGCCCG TCTTCGGCGC CTCGGGTGTT 150
CGGTCCTGGC TCTCGGGCTG CTGGCCNCTG CGCCCCACCC CGCACCGGGC 200
CGGCTTC 207

(2) INFORMATION FOR SEQ ID NO: 27 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1084
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 27

YCNAGNCKCG TNATNGCSGN CKCATNTNAC NGGANCCNGG ATTNCSTACG 50 
CCACNGTGAT CGCGCTGGTN GCCGCGCTGG TGGCGCGTGT ACGTGCTCTC 100 
GTCCACCGGN AANTAAGCGC ACCATCGTGG GCTACTTCAC CTCTGCTGTC 150 
GGGCTCTATC CCGGTGACCA GGTCCGCGTC CTGGGCGTCC NGGTGGGTGA 200 
GATCGACATG ATCGAGCCGC GGTCGTCCGA CGTSAAGATC ACTATGTCGG 250
TGTCCAAGGA CGTCAAGGTG CCCGTGSACG NTGCAGGCC 289

(2) INFORMATION FOR SEQ ID NO: 28

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 198 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1089
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 28 

TTGNACCANG CCTATCGCAA GCCAATCACC TATGACACGC TGTGGCAGGC 50 
TGACACCGAT CCGCTGCCAG TCGTCTTCCC CATTGTGCAA GGTGAACTGA 100
GCAANGCAGA CCGGACAACA GGTATCGATA GCGCCGAATG CCGGCTTGGA 150 
CCCGGTGAAT TATCAGAACT TYGCAGTCAC GAACGACGGG GTGATTTT 198 
(2) INFORMATION FOR SEQ ID NO: 29
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 149

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-1090

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 29

TCACGANGGT RYNACMGCAA CWCGACCGCC ACGTCASGCC GCCGCGCACG 50
AAGATCACCG TGCCTGCNCG ATGGGTCGTG AACGGAATAG AAYGCAGCGG 100
TGAGGTCAAN YGCGAAGCCG GGAACCAAAT CCGGTGACCG CGTCGGCAT 149
(2) INFORMATION FOR SEQ ID NO: 30

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 210

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-1104

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 30

GGACCCGCCA AGCATCAGCC GGTCAACAGC CGCCGCCGGT GGCCAAAGTT 50 
CGAGCAGCCG CCGGTATCGT GCTCGGCCCG GCTAGACCAA AAACTTTACG 100 
CCAGCGCCCG AAGCCACCCG ACTCCAAGGC CTCGGCCCGG TTGGGTTCGC 150 
ACATGGGTGA GTTCTATATG CCCTACCCGG GCACCCGGTT CAACCAGGAA 200
ACCGTCTCGC 210

(2) INFORMATION FOR SEQ ID NO: 31

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 255
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-9 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 31

CAGNCCGCTG NCCCGGAACT GTTCCAGCAG CTACAAGACC TJLCGACAACG 50 
TNGCGCGTCA ACCTGCANTC GAGCGCAACC TCTCGGTGGC GCTCAACGAG 100 
TGTTCGCCGG CTTCAACCCG CTGGACCCGC GAAACCTCGA CGTGTCCCCG 150 
CTGCCTTCGC TGGCCAAGCG CGCCGCCGAC ATCCTGCGCC AGGACGTGGG 200
CGGGCAGGTC GACATTTTCG ATGTCAATGT GCCCACCATC CAGTACGACC 250
AGAGC 255

(2) INFORMATION FOR SEQ ID NO: 31
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 164
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-12
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 31 

AAYNCCNGGC CRTCGACGGT NCCGGTTCNC RCCACCGGTC TATATCCACC 50 
CGGGTCNRCA TTMANANTGA NTMNCCGCCG GTGCGGCCGT CGAGCGTGAC 100
CTGGCATCCC CTGAGACGCT GCTGGGTTGC CCCGGGGAGN TCGAMANTCG 150 
GGCATCGCAC CATC 164 
(2) INFORMATION FOR SEQ ID NO: 32 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 237

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-15
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 32

ACGGACGGCA ACGGGATGCG ACCCGATCCC ACCGGTCGCC ACGAGGGACG 50 
CTACTTCGTC GCCGGGCAGC CGANCCGACC GTCNGTTCNG CGANGGCGAC 100 
NGCCGAAGCC GTTGACCCAC NTTGGTCAGC AGCAGCTGGA TSAGTCAGGT 150 
GCCGTTGGTG TTTCGCCGTC AGCGGTGTCG GGGTGGGTGC GTTCTGGGCA 200
CCGTCGACTG TGGTGGGCGC TNGCGGGCGN TGGTGGC 237

(2) INFORMATION FOR SEQ ID NO: 33

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-47 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 33

CNGATNGCTC GGNCTNCGGT ACCNAACTCG NAACTCGCGC CCWYGCGNAC 50
GCAGGNCCGC GGTTGGCACC ACCAGCGACA TCAATCANGC AGGWKNCCCG 100
CCACGTTGCA AGACGGCGGC AATCTTCGCC TGTCGCTCAC CGACTTTCCG 150
CCCAACTTCA ACATCTTGCA CATCGACGGC AACAACGCCG AGGTCGCGGC 200
GATGATGAAA GCCACCTTGC CGCGCGCGTT CATCATCGGA CCGGACGGCT 250
CGNACGNACG GTCGACACCA ACTACTTCAC CAGCATCGAG CTGACCAGGA 300
CCGCCCCGCA GGTGGTCACC TACACCATCA ATCCCGAGGC GGTGTGGTCC 350
GACGGGACCC CGATCACCTG GCCG 374 
(2) INFORMATION FOR SEQ ID NO: 34 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-78 (overlaps with AciI#3-167)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 34

GAGAACTCCG GGCCGANTTT TGGACA 26
(2) INFORMATION FOR SEQ ID NO: 35

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 204

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-133
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 35

TGTCGGGTNA RNGTTCGCGT CCATGATTGC TCTTGCAACG CTGTTGACGC 50
TTATCAATCA AGTCGTCGGC ACTCCGTATA TTCCCGGTGG CGATTCTCCC 100 
GCCGGGACCG ACTGCTCGGA GCTGGCTTCG TGGGTATCGA ATGCGGCGAC 150 
GGCCAGGCCG GTTTTCGGAG ATAGGTTCAA CACCGGCAAC GAGGAAGCGC 200 
CTTG 204 
(2) INFORMATION FOR SEQ ID NO: 36

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 312 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#3-134

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 36

CANNTTAGAC TGTCGTGACA TATCNCNNTN TACNCNTGGN ACGGCCATNA 50
TTGGATAATN CGTGATAANC ACCACAAGAA TNATTCCTAT GNATATTGTC 100
GGTACGTTCG CGNCCATGAT TNGCTCTTGC AACGCTGTTG ACGCTTATCA 150
ATCAAGTCGT CGNCACTCCG TATATTCCCG GTGNCGATTC TCCCGCCGGG 200
ACCGACTGCT CRGAGCTGGC TTCGTGGGTA TCGAATGCGS CGACGSCCAG 250
GCCGGTTTTC GSAGATAGGT TCAACACCGG CAACGAGGAA GCGCCTTGGC 300
GGCTCGGGGC TN 312
(2) INFORMATION FOR SEQ ID NO: 37

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#3-166
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 37

AGGCCAATCG NTGATGCGAC TCGAACGGGT TCGGCGCCGA TGACTGTTTC 50
GCGAAGTTCA TCAGCACCCT CGTTGGCGCG AAGGGCACGA CGGTGTACCG 100
GWWRYSAMKA CRCYGCYATG AGTYTCTGCS TGTATTGCGG TGCSGAGCTT 150
GCCGACCCGA CCAGGTGCGG KGCGTGNCGG CSCAKACWAG ATTGGTTCAA 200
CCTGGCNATC GGACCNACGA CGCCGACGGT CGGCGCCGCG ACGACGGCAN 250
ACGGNATNGC GACCCGANTC CNYACCNGGT CGCCACGAGG GACGNCTACT 300
TCGTCGCCNG GCAGCCGACC GANCTCGTTN NNCGCGASGN CGACGCCGAA 350
GCCGTTGACC CACTTGGTCA GCAGCAGCTG GNNATCANGN TCANGGTGCC 400
GTTNNGGTGT TTCGCCGTCA GCGGTGTCGG GGTGGGTGCG TTCTGGGCAC 450
CGTCGACTGT GGTGGGCGCT TGCGGGCGTG GTGGCGTTTC TCGGGCTGGT 500
GGGAGCCGGT GTCGTCGGGA CGCTGTTCCT GAATCGAGAC CGGGAGTCCA 550
TCGACGACAA GTACCTCGCN CCTTGAGGCG GTCCGGACTC ACCGGTGAGT 600
TCAACTCCGA CGCGAACGCC ATCGCCCGCS GCAAGCAGGT GTGCCGCCAG 650
TTGCANASAC GGTGGCGAAC AGCNSA 676

(2) INFORMATION FOR SEQ ID NO: 38
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 853
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-167
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 38 

GTGNGCGCGC CNTCGAGCAN GTCTTGGCNG CGANCCCGAB ACAANTGATT 50
CCCGACATCC GGTACACACC GAACCCCNAA NCGATGCGCC NGGCGGCCCG 100
CTGGTAGAAA GGGGAAATCG CCAGTGCTGA CTCGCKTCAT CCGACGCCAG 150
TTGAKCCKTT TKGCGAKCGT CKCCGTAGTG GCAATCGTCG TATTGGGCTG 200
GTACTACCTG CGAATTCCGA GTCTGGTGGG TNGTCGSGCA GTACACCTTG 250
AAGGCCGACT TGCCCGNATC GGGTGGCCTG TATCCGACGG CCAATGTGAC 300
CTACCGCGGT ATCACCATTG GCAAGGTTAC TGCCGTCGAG SCCACCGACC 350
AGGGCNGCAC GANGTGACGA TGAGCATCGC CAGNCAACTA SAAAATCSCC 400
GTCGATGCCT NCGGCGAACG TGCATTCGGN GTCAGCGGTN GGCGAGCAGT 450
ACATCGACCT NGTGTCCACC GGTGCTCCGG GTNAAATACT TCTCCTCCGG 500
ACAGACCATC ACCAANGGCA CCGTTCCCAG TGAGATCGGG CCGGCGCTGG 550
ACAANTCCSA ATCNGCGGGT TGGCCGCATT NGCCCACGGA GAAGATCGGC 600
TTGCTGCTCG ACGAGACNGC GCAAGCGGTG GGTGGGCTGG GACCCGCGNN 650
TTGCAACGGT TGGTCGATTC CACTCAAGCG ATCGTCGGTG ACTTCAAAAC 700
CAACATTGGC GACGTCAACG ACATCATCGA GAACTCCGGG CCGATTTTGG 750
ACAGCCAGGT CAACACGGGT GATCAGATCG ACGCTGGGCG CGCAAATTGA 800
ACAATSTGGC CGCACAGACC GCNGACCAGG GAKCAGAACG TGCGAAGCAT 850
CCT 853

(2) INFORMATION FOR SEQ ID NO: 39
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 209

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-204 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 39

GCGGTTGGCA CCACCAGCGA -AATCAGCAG GNDCCCGCCA CGTTGCAAGA 50
CGGCGGCAAT CTTCGCCTGT CGCTCACCGA CTTTCCGCCC AACTTCAACA 100 
TCTTGCACAT CGACGGCAAB AABGCCGAGG TCGCGGCGAT GATGAAAGCC 150 
ACCTTGCCGC GCGCGTTCAT CATCGGACCG GACGGCTCGA CGACGGTCGA 200
CACCAACTA 209
(2) INFORMATION FOR SEQ ID NO: 40
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 166
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-206 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 40

AGATCGTCAG TGAGCAGAAC CCCGCCAAAC CGGCCGCCCG AGGTGTTGTT 50
CSAGGGCTGA AGNCNCTGCT CGCGACGGTC GCTGCTGGCC GTCGTCGGGA 100
TCGGGCTTGG CTCGCGCTGT ACTTCACGCC GGCGATGTCG NCCCGCGAGA 150
TCGTGTATCA TCGGGT 166 
(2) INFORMATION FOR SEQ ID NO: 41 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 221 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-214
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 42

CCAGNTCCTC NNATATCGAC ACCCTCNACN AAGACCGCTT CGCGAGATCA 50 
ACNCTCAGAT ATNCNNACTA TCNCCNNTNC ACGCACACCT CAACATNANA 100 
NAATNGAACT ATNGNCTTCG CCTCACCACC AAGGTTCAGG TTANCGGCTG 150
NCGTTTKCTC TKCGCCGGCT CGAACACGCC ATCGTGCGCC GGKACACCCG 200
GATGTTTGAC GACCCGCTGC A 221
(2) INFORMATION FOR SEQ ID NO: 43

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#3-281
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 43

CGGYCCGNNC AAYYYGNCGC GCHNCGGYGY AGAGGTCGNY AAGGTCGCCA 50
AGGTAACGCT GATCGAYGGG NACANGCAAG TATTGGTGNA CTTCACCGTG 100 
GHTHGCTHGC TGTYAGC 117 
(2) INFORMATION FOR SEQ ID NO: 44 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 385

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: BsaHI#1-21
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 44

GAACCTCCTC GCCCGCGCTT GGCCTAGCAT TAATCGACTG GCACGACAGT 50
TGCCCGACTG GGTACACGGC ATGGACGCAA CGCGAATGAA TGTGAGTTAG 100
CTCACTCATT AGGCACCCCA GGCGTTGACA CTTTATGCTT CCGGCTCGTG 150
TAGTTGTGTG GGAATTGTGG AGCGGATAAC AATTTCGACG ACGAGGAAAC 200
AGCTGTAGAC ATGGATTGAC GAATTTGAAT ACGACTCACT ATAGGAATTC 250
GAGCTCGGTA CCCGGGGATC CTCTAGAGTC CTTCGCCGCG GGTCGCCACC 300
ATCAGGGCCA GTGCGATCGC AAGCGCGGGG TACCGGGCGC CATAGTCTTC 350
AGCATCGGCG TGTTGACCGC AGAGACCGGA CGGGG 385
(2) INFORMATION FOR SEQ ID NO: 45

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 285

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-12
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 45

CCCGCAGCAG TACCCGCAGN CCCACACCCG CTATNCGCAG CCCGAACAGT 50
TCGGTGCACA GCCCACCCNA GCTCGGCGTG CCCGGTCAGT ACGGCCAATA 100
CCAGCAGCCG GGCCAATATG NCCAGCCGGN ACAGTNACGN CCAGCCCGGC 150
CAGTACGCNA CCGCCCGGTC AGTACCCCGG GCAATACGGC CCGTATGNCC 200
AGTCGGGTCA GGGGTCGAAG CGTTCGGTTG CGGTGATCGG CGGCGTGATC 250
GCCGTGATGG CCGTGCTGTT CATCGGCGCG GTTCT 285

(2) INFORMATION FOR SEQ ID NO: 46

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-142
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 46

GCNCGTGNCC GTGCCGCCCG GTTGAACGTG AGCNGCTGNC NATNGCCCCA 50
GCCGAGACGA GAACGTCCCC GAGGAGTATG CAGACTGGGA AGACGCCGAA 100
GACTATGACG ACTATGACGA CTATGAGGCC GCAGACCAGG AGGCCGCACG 150
GTCGGCATCC TGGCGACGGC GGTTGCGGGT NCGGTT 186
(2) INFORMATION FOR SEQ ID NO: 47
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-144
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 47

GTCGCTGAAT GTGTTGTCGG AGACCGTGAT CAGACCTATC CGCACCTGAG 50
CGCCGCCTCC ACGGGTGGCT AAGTTCTCCG ACACCATCGG CAAGCGCGAC 100
GAGCAGACTC ANGCACCTAC TAGCCCAGGC CAACCAGGTG GCCAGCATCC 150
TGGGTGATCG CAGTGAGCAG GTCGACCGCC TATTGGTCAA CGCTAAGACC 200
CTGATCGCCG CGTTNCAACR GASNGCGCCG CGCGGTCGAC GCCCTGCTGG 250
GGAACATCTC CGCTTTCTCG CCCAGGYGCA AAACCTTCAT SAACGACAAN 300
CCGAACCTGA ACCATGTGCT CGAGCNGCGC ATCCTSACSA CCTGTTGGTS 350
GACSGCAAGG AGGATTTGGC TGAAANCCTN ACGATSTTGG GCAGAKTCAG 400
CG 402

(2) INFORMATION FOR SEQ ID NO: 48

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 468
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#1-200
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 48

AGNCCGTGCA CTGGAANCTT CGGCTCAGWT GTCTCCGATG TGGACGGCAA 50
SGCTGATGAT CTCCCGGTTG GAAGTCGANT CGATKASAAA TGGCTTGGCG 100
GCTGGTGGTG TTCGATGCCT GGCACCRACT GGCBACGATC NSCGCCTGGN 150
CGCGATCGGC GCTTAGCTCG GCTGGNNCCC TGTGGTGGGT TTCGACGTGC 200
TCGGTGTTGG TGCTGCTGGT GGTCGAAGGT GTGGCAATCA ACGTTCTGGC 250
TGTTGCGTCG TGATTCGGTA ACCGTCGGTA CCGACGACGA TGCGCCCGGG 300
CTGCGACTGG CCGTTGTCTT CCTGTGCNNG CCGCCGCGAT CTCGGCGGCN 350
GTGGTGACTG GGTACCTGCG CTGGACGACA CCGGACCGCG ACTTCAATCG 400
GGATTCCCGG GAAGTGGTGC ATCTTGCCAC GGGGATGGCC GAGACGGTCG 450
CGTCATTCTC CCCGAGCG 468

(2) INFORMATION FOR SEQ ID NO: 49
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 417 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:
(D) OTHER INFORMATION: HinPI#2-23
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 49

GTCCAAGGCC GTAGCCCACC TCCTGGAAGT CGTACCACGT CGACTCGACC 50 
AGGACGGCTG CAGTCAGCAC TTCGTCAACC CGCGATCATC AACGTGCACC 100 
TACGGCAGTG TGACGCACCC CGGACCATCG CACTGGCCGG GGTTCACACG 150
CCGAACACTG CTGACCGCAC TGGATCTGCT GGTCGCATGC ACCACTTCAA 200
GGTGGTGACG TACCTCAAAA TGGGTTTCCC GTTGTCCACC GAGGAAGTCC 250
CGCTGATTCA TGGGCAATAA CGCTCCCTAT CCGCAGTGTC ACCAGTGGGT 300
GCAAGCGGCG ATGGCCAAGT TGGTCGCTGA CCACCCCGAC TACGTTTTCA 350
CAACCTCGAC TCGACCGTGG AACATCAAAC CCGGCGATGT GATGCCAGCA 400
ACCTATGTCG GGATCTG 417

(2) INFORMATION FOR SEQ ID NO: 50
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 279 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#2-143 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 50 

CGGTCGAGCC GATGAACGTC TGCAGTTCAC CGCAACCACG CTCAGCGGTG 50 
CTCCCTTCGA TGCGCAAGCC TGCAAGGCAA TGCCGCGGTG TTGTGGTTCT 100
GGACGCCGTG GTGCCCGTTC TGCAACTGTC AGAAGCCCCC AGCCGCAGCC 150
AGGTAGCGGC CGCTAATCCG GCGGTCACCT TCGTCGGAAT CGCCACCCGC 200
GCCGACGTCG GGGCGATGCA GAGCTTTGTC TCGAAGTACA ACCTGAATTT 250
CACCAACCTC AATGACGCCG ATGGTGTGA 279

(2) INFORMATION FOR SEQ ID NO: 51

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 324

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#2-145
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 51

CGGCCCGGCG GCGCCCTGGT GAAGCTTGGA GAATGGGTGA GCGCAGCTGC 50
CCACCACACG GGACCGGTGC GGACGCGSTG ACGCGCCTGG TGGTCAGCAN 100
CNTGGCCGGT CTGCTGTTGT ATGCCAGCTT CCCGCCGCGC AACTGCTGGT 150
GGCGGCGGTG GTTGGGCTNC GCATTGCTGG CCTGGGTGCT GACCCACCGC 200
GCGACGACAC CGGTGGGTGG GCTGGGCTAC GGCCTGCTAT TCGGCCTGGT 250
GTTCTACGTC TCGTTGTTGC CGTGGATCGG CGAGCTGGTG CNCCGGGCCC 300
TGGTTGGCAC TGNCGACGAC GTGC 324

(2) INFORMATION FOR SEQ ID NO: 52

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 229

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#2-150 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 52 

CCAGGCTAGC ACGTATGCTC CGGCTCGTTG TGTGTGGAAT GTGAGCGGAT 50
GACANKNCAC ACAGGADAYA GCTATGACNA TGATTACGCC AAGCTATTTA 100
GGTGABACTA TAGAATAYTC AAGCTATGCA TCCAAYGCGT TGGGAGCTCT 150
YCCATATGGT CGACCTGCAY GCGGCCGCAC TAGTGATTST THGCGCCGGC 200
NYGCWGCGGC NYAYGACCGC YAAYACCAC 229
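
Several entries in this listing, including SEQ ID NO: 52 above, carry IUPAC ambiguity codes (for example N, Y, K and W) at positions where the base call was uncertain, and every entry is described as double-stranded. The following Python sketch is illustrative only and is not part of the original disclosure; the function name `reverse_complement` is an assumption. It shows one way the complementary strand of such a sequence could be derived while keeping the ambiguity codes consistent.

```python
# Illustrative sketch: reverse complement that preserves IUPAC ambiguity codes.
# Pairs follow the IUPAC nucleotide code: R<->Y, K<->M, B<->V, D<->H; S, W and N
# are their own complements.
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    return seq.upper().translate(IUPAC_COMPLEMENT)[::-1]


# First ten residues of the third row of SEQ ID NO: 52, which contains the code B.
fragment = "GGTGABACTA"
print(reverse_complement(fragment))
```

Applying the function twice returns the original string, which is a quick consistency check on the complement table.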

(2) INFORMATION FOR SEQ ID NO: 53 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 293

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-28
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 53

CCACACAACA CAAATCTACG TCGTAATGCA GTCGTAAGTC CATCCGACGT 50 
CGATGGCAAG GACAGCACCC GACGGCCAAC GGCATATACA TCGTCGGCTC 100 
GCCGGTCACA AGCACATCAT CATGGACTCG TCCACTACGG CGTACCCGTC 150 
AACTCGCCCA ACGGATATCG CACCGATGTC GACTGGCCAC CCAGATCTCC 200
TACAGCGGTG TCTTCGTGCA CTCAGCGCCG TGGTCGGTGG GGGCTCAGGG 250
CCACACCAAC ACCAGCCATG GCTGCCTGAA CGTCAGCCCG AGC 293

(2) INFORMATION FOR SEQ ID NO: 54 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 816

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-30 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 54 

CGNCGYCGSC GNGCSCTAYC GGTGCGGGAG GGTACAYCCA AGCANTCCGG 50
GACCGGCCGT CYCGCYGGGA ACGCCGTGCT CCTACAYACC GGCGRCGGGC 100
GCGTTGCCAC GSCCCGACAC CCCACTACCC NGNCGCGGGC GCCACCRTTG 150
GCCCGTTNMG GTGGACCCGA NCTTCCCGGC ACCGCTCGAT GTCCAGCCGT 200
CGCCGCCTAA TCCCGATGGG CCGCMGCCGA CKCCGGGCAT CCTAAGTGCT 250
GGGCGGCCGG GCGAGCCGGN TCCGGNTGTT CCGGCATACC GWTGCCSYTG 300
CCGNCGAAC- TGCACGCACC CAACCGCTTG AGCCGTTTCC TGACGGGACG 350
GGAGGTAGCA ACCAATGAGC ACCATCTTCG AYATCCGSAG CCTGCKACTN 400
GYCGAWACTG TCTNGCAAAG GTAGTGGTCG TCGGCGGGTT GGTGGTGGTC 450
TTGGCGGTCG TRGCCGNCTG NCRGCCGGCG CGCRGCTCTA CCGGAAACTG 500
ACTANACTAC CGTGGTCGCR TATTTTCTST GAGGCGCTCG CGCTGTACCC 550
AGGAGASAAA GTCCAGATCA TGGGTGTGCG GGTCGGTTCT ATCGACAAGA 600
TCGAGCCGGC CGGCGACAAG ATGCGAGTCA CGTTGCACTA NCAGCAASAA 650
ATACCAGGTG CCGGCCACGC TACCGNYGNW CGMTCCTCAA CCCCAGCCTG 700
GTGGCCTCGC GCACCATCCA GCTGTCACCN NCGTACACCG GCGGCCCGGT 750
CTTGCAAGAC GGCGCGGTGA TSCCAATCGA GCGCACCCAG RTGCCCGTCG 800
AGTGGGATCA GTTGCG 816 

(2) INFORMATION FOR SEQ ID NO: 55
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-34 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 55 

CAGCCACCTC GTTCGCCGCC GACATCGACT ATCAGCCGAC CCGGCCACTG 50
CTGACCTGAT CGCCAACAGC TGGAGGCCCT ACCGGCTGCA GTTCAATTCA 100 
CCCGCTGCGG GTCGGCG 117 
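
Each entry states a LENGTH under SEQUENCE CHARACTERISTICS and prints a running position count at the end of every sequence row, so the two can be cross-checked. The following Python sketch is illustrative only and is not part of the original disclosure; the helper name `parse_listing_rows` is an assumption. It joins the rows of SEQ ID NO: 55, discards the position counters, and compares the residue count with the declared length of 117.

```python
import re


def parse_listing_rows(rows):
    """Join sequence rows into one string (hypothetical helper).

    Only runs of letters are kept, so the blank-separated groups of ten
    residues are concatenated and the trailing position counter is dropped.
    """
    residues = []
    for row in rows:
        residues.extend(re.findall(r"[A-Za-z]+", row))
    return "".join(residues).upper()


# Rows of SEQ ID NO: 55 as listed in this document.
seq55_rows = [
    "CAGCCACCTC GTTCGCCGCC GACATCGACT ATCAGCCGAC CCGGCCACTG 50",
    "CTGACCTGAT CGCCAACAGC TGGAGGCCCT ACCGGCTGCA GTTCAATTCA 100",
    "CCCGCTGCGG GTCGGCG 117",
]

seq55 = parse_listing_rows(seq55_rows)
declared_length = 117  # from "(A) LENGTH: 117"
print(len(seq55) == declared_length)
```

The same check can be applied to any other entry by substituting its rows and declared length.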
(2) INFORMATION FOR SEQ ID NO: 56
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 242

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-41 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 56

AGGTGTCGTG CTTCATGCCT GGCGCCCAAT CCAGTTTCTA CACCGACTGG 50 
TATCACCCTT CGCAGACAAA CGGCCAGAAC TACACCTACA AGTGGGAGAC 100 
CTTCCTTACC ACACAGATGC CCGCCTGGCT ACAGGCCAAC AAGGCGTGTC 150
CCCCACAGGC AACGCGGCGG TGGGTCTTTC GATCTCGGGC GGTTCCGCGC 200
TGACCCTGGC CGCGTACTAC CCGCAGCAGT TCCCGTACGC CG 242
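
The claims refer to peptides encoded by the listed nucleotide sequences. The following Python sketch is illustrative only and is not part of the original disclosure; it assumes the standard genetic code, and the reading frame of each entry is not stated in this section, so the frame argument is likewise an assumption. It translates a nucleotide string in a chosen forward frame and writes X for any codon that contains an ambiguity code.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a DNA string in forward frame 0, 1 or 2 (hypothetical helper).

    Stop codons are written as '*'; codons containing ambiguity codes
    (for example N) are written as 'X'.
    """
    seq = seq.upper()
    peptide = []
    for i in range(frame, len(seq) - 2, 3):
        peptide.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(peptide)


# First 30 residues of SEQ ID NO: 56, translated in frame 0.
print(translate("AGGTGTCGTGCTTCATGCCTGGCGCCCAAT"))
```

Translating the reverse complement in the same three frames would cover the opposite strand.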

(2) INFORMATION FOR SEQ ID NO: 57 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HpaII#1-3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 57 

TGCTGCAGAT AGCCAAGGAT CCAGTCGTGA TTGATATCAC GTCTTTCCAG 50 
TGAATTGAAG TTTGGCTATC AAAGGGTGAA CTTSAAAGAC GGCACACTGA 100 
CCTATGATGG TGCCGATCCG GAGCGCAAGC GCGCCATGGT TTCCAAGCCA 150 
GAGGGCAAGN ACAAGTACGG CGAAGAGCTG GTCGGGCCGG TGCGCGGGCT 200 
CAACACCGAG GACCGGACCT ACCTGAATTT CGACAAGGTC GAGACGTTGG 250 
GCAGCAGCAC CGAAATTCCG GTGCTGGTGC TGCCGTCCGG CAAGCGTATC 300
GAATTCCAAA TGGCCTCAGC CGATGTGATA CACGCATTCT 340 
(2) INFORMATION FOR SEQ ID NO: 58 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 262
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HpaII#1-8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 58 

CNGACTCCAA CNAGTGCGNT CAANCNGNTG TNCCNGACAA GAAGGTTCCT 50 
ACATCCGCAA NTCGGTGNAA NGCCACTGTG GATGCCTACG ACGGAACGGT 100
CACGCTGTAC CAACAGGACG NAAAAGGATC CGGTGCTCAA GGCCTGGATG 150
CAGGTCTTCC CCGGCACGGT AAAGCCTAAG AGCGACATTG CGCCGGAGCT 200
TGCCGAGCAN CTGCGGTATC CCGAGGACCT GTTCAAGGTG CAGCGCATGT 250
TGTTGGCCAA AT 262

(2) INFORMATION FOR SEQ ID NO: 59

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HpaII#1-10
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 59 

CCACCANNNA ACRRCACAGC TCCGGCCRRC CGTNCGCAGG CCACCCGCAN 50 
CGTAGTGCTC AAATTCTTCC AGGACCTCGG TGGGGYACAT CCGTCCACCT 100
GGTACAAGGC CTTCAACTAC AACCTCGCGA CCTCGCAGCC CATCACCTTC 150 
GACACGTTGT TCGTGCCCGG CACCACGCCA CTGGACAGCA TCTACCCCAT 200 
CGTTCAGCGC GAGCTGGCAC GTCAGACCGG TTTCGGTGCC G 241 
(2) INFORMATION FOR SEQ ID NO: 60
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 243 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HpaII#1-13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 60

CCGGCGGATC TGCGTGACGA NTGTATNCCA CGGNACTACC CGCGGTCCTT 50
CCTCNANTNC CGCCGGNCCA GNCGCAGNCT NCNGATGTCC NGCTATAACC 100
TGCGCGATCG CCGCCGGGCT GCCCGACAAC ACGGTGNGCG CCGCCGCTGC 150
TTCCGCCAAT TCTGGGTGNC GGCATNCCGG CAGCGCCCGG CCCAGCACTG 200
AGAGGGGGAC GTTGATGCGG TGGCCGACGG CGTGGCTGCT GGC 243

(2) INFORMATION FOR SEQ ID NO: 61 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2346

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-825 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 61

GCGCTGNCAT TCGNACTTCG GACNGCGTTN GCGGTGGTGC TGATCATGAA 50
NCTACGACGG CGCCACCGGC AGCTTCCCGT CATGGGTGCT CTATCCCTGT 100
GCGCTGGCCA TGATGGTGTT CTCGAATKCG TTCAGCGTNC TGCGCAGCGC 150
AGTGANACCG AGGGTGATGC CGCCAACCAT CGACTTGGTC CGGGTCAACT 200
CACGGCTGAC CGTGTTCGGC CTGCTCGGCG GCACCATCGC TGGTGGCGCG 250
ATTGCGGCCG GAGTCGAATT CGTCTGCACC CACCTGTTCC AGGTGCCGGG 300
CGCGTTGTTC GTCGTCGTCG CGATCACCAT CNNTNNNGCT TCGCTGTCGA 350
TNCNCATTCC GCGCTGGGTC GAGGTGACCA GCGGTGAGGT CCCGGCCACA 400
TTGAGCTACC ACCGGGATAG GGNCAGACTA CGGCGACNGC TGGCCGGAGG 450
AAGTCAAGAA CCTCGGCGGA ACACTCCGAC AACCGTTGGG CCGCAACATC 500
ATTACCTCCC TGTGGGGTAA CTGCACCATC AAGGTGATGG TCGGCTTTCT 550
GTTCTTGTAT CCGGCGTTTG TCGCCAAGGC GCACGAAGCC AACGGGTGGG 600
TGCAATTGGG CATGCTGGGC CTGATCGGCG CGGCGGCCGC GGTCGGCAAC 650
TTCGCCGGCA ATTTCACCAG CGCACGCCTG CAGCTAGGCA GGCCAGCTGT 700
GCKGGTNGTG CGCTGCACCG TGCTAGTTAC CGTGTTAGCC ATCGCGGCCG 750
CGGTGGCCGG CAGCCTGGCA GCGACAGCNA TTGCCACCCT GATCACGGCA 800
GGGTCCAGTG CCATTGCTAA AGCCTCGCTG GACGCCTCGT TGCAGCACGA 850
CCTGCCCGAG GAGTCGCGGG CATCGGGGTT TGGGCGTTCC GAGTCGACTC 900
TTCAGCTGGC CTGGGTGCTG GGCGGCGCGG TGGGCGTGTT GGTGTACACC 950
GAGCTGTGGG TGGGCTTCAC TGCGGTGAGC GCGCTGCTGA TCCTGGGTCT 1000
GGCTCAGACC ATCGTCAGCT TCCGCGGCGA TTCGCTGATC CCTGGCCTGG 1050
GCGGTAATCG GCCCGTGATG GCCGAGCAAG AAACCACCCG TCGTGGTGCG 1100
GCGGTGGCGC CGNAGTGAAG CGCGGTGTCG CAACGCTGCC GGTGATCCTG 1150
GTGATTCTGC TCTCGGTGGC GGCCGGGGCC GGTGCATGGC TGCTAGTACG 1200
CGGACACGGT CCGCAGCAAC CCGAGATCAG CGCTTACTCG CACGGGCACC 1250
TGACCCGCGT GGGGCCCTAT TTGTACTGCA ACGTGGTCGA CCTCGACGAC 1300
TGTCAGACCC CGCANGCGCA GGGCGAATTG CCGGTAAGCG AACGCTATCC 1350
CGTGCAGCTC TCGGTACCCG AAGTCATTTC CCGGGCGCCG TGGCGTTTGC 1400
TGCAGGTATA CCAGGACCCC GCCAACACCA CCAGCACCTT GTTTCGGCCG 1450
GACACCCGGT TGGCGGTCAC CATCCCCACT GTCGACCCGC AGCGCGGGCG 1500
GCTGACCGGG ATTGTCGTGC AGTTGCTGAC GTTGGTGGTC GACCACTCGG 1550
GTGAACTACG CGACGTNCGC ACGCGGAATG GTCGGTGCGC CTTATCTTTT 1600
GACGAGGCCG CGGCTCGACG NC-CCTTAAG CGCGGTCGGC GCCAACGGTC 1650
CGAAGAGCCG CCGACACCCG GGGCACATCG GCGCATCATG GAACTGTGCG 1700
GATCGGAGTC GGGGTTTGCA CCACGCCCGA CGCGCGGCAG GCCGCGGTGG 1750
AGGCTGCGGG CCAGGCGCGC GACGAGCTGG CGGGTGAGGC GCCGTCGCTG 1800
GCGGTGTTGC TTGGATCGCG TGCACACACC GACCGGGCTG CCGACGTCCT 1850
GAGCGCGGTG CTGCAGATGA TCGACCCGCC CGCGCTTGTC GGTTGCATCG 1900
CCCAGGCCAT CGTCGCCGGC CGCCACGAGA TCGAGGACGA GCCCGCGGTG 1950
GTGGTGTGGC TGGCGTCCGG CTTGGCCGCC GAGACATTCC AGCTGGACTT 2000
TGTCNGTACC GGCTCGGGTG CCCTGATCAC CGGTTATCGG TTCGACCGNA 2050
CCGCCCGGGA TCTGCATCTG CTGCTGCCGG ACCCGTACAC ATTCCCGTCG 2100
AACCTGCTCA TCGAGCACCC CAACACCGAC CTGCCGGGCA CCGCNGTCGT 2150
GGGCGGCGNT GGTGAGCGGC GGGCGCCGGC GGGGCGACAC CCGGSTGTKC 2200
CGCGATCACG ACGTGCTCAC CTCCGGMGTC GTCGGCGTGC GCCTGCSCGG 2250
GATGCGCGGT GTMCCGGTCG TGTCGCAGGG TTGNCGGCCG ATCGGCTACC 2300
CATACATCGT CACCGGMGCG GACGGCATAC TGRKCACCGA GCTCGG 2346

(2) INFORMATION FOR SEQ ID NO: 62

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 841 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#435
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 62

CGTTACCCGC TTTACACCAC CGCCAAGGCC AACCTGACCG CGCTCAGCAC 50
CGGGCTGTCC AGCTGTGCGA TGGCCGACGA CGTGCTGGNC NAGSCCNANS 100
CCAATGNCGG MMTGCTGCAA NCGGNTNCNG GCCANGCGTT CGGACCGGAC 150
GGACGCTGGN CGGTATCAGT CCNGTCGGCT TCAAANCCGA NGGCGTGGGC 200
GAGGACCTCA AGTCCGRRCC CGGTGGTCTC NAAACCCSGG CTNGTCAACT 250
CCGATNCGTC GCCCAACAAN CCCAACNGCC NGCCATCANC GACTCCKCNG 300
GCACCGCCNG AGGGAAGGGY CCGGNTCGGG ATTCAACGGG TTGGCRWCGC 350
GGCGCTGCCG TTCNGRATTG GAYCCGGCAN CGTACCCCGG TGATGGGCAG 400
CTNACGGGGA NGAACAACCY GSCCSSSACG GCCACCTCGG CCTGGTACCA 450
GTTACCGCCC CGCAGCCCGG ACCGGCCNGC TGGTGGTGGT TTCCNGCGGC 500
CGGCGCCATC TGGTCCTACA AGGAGGACGG CGATDTCATC TACGGCCANG 550
TCCCNTGAAA CTGCAGTGGG NCGTCACCGG CCCGGACGGC CGCANTCCAG 600
CCACTGGGGC AGGTATTTCC GANTCGACAN TCGGACCNGC AACNCCNGCG 650
TGGCGCAATC TGCGGTNTNT CCGCTGGCCT GGGCGCCGCC GGNANGCNCG 700
ACGTGGCGCG CATTGTCGCC TATGACCCGA ACCTGAGCCC TGAGCAATGG 750
TTCGCCTTCA CCCCGCCCCG GGTTCCGGTG CTGGAATCTC TGCAGCGGTT 800
GAKCGGGTCA GCGACACCGG TGTTGATGGA CATCGCGACC G 841

(2) INFORMATION FOR SEQ ID NO: 63

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 471 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#1-2/23/9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 63

GCCAGCCGTG ATCGGCTGAC CGGCAGTGAT CACCAACCTC AACGTGGTGC 50 
TGGGCCTCGC TGGCGCTCAC ACGATCGGTT GGACCAGCCG GTGACGTCGC 100
TATCAGCGTT GATTCACCGG CTCGCGCAAC GCAAGACCGA CATCTCCAAC 150
GCCGTGGCCT ACACCAACGC GCCGCCGGCT CGGTCGCCGA TCTCTGTCGC 200
AGGCTCGCGC CGTTGGCGAA GGTGGTTCGC GAGACCGATC GGGTGGCCGG 250
CATCGCGGCC GCCGACCACG ACTACCTCGA CAATCTGCTC AACACGCTGC 300
CGGACAAATA CCAGGCGCTG GTCCGCCAGG GTATGTACGG CGACTTCTTC 350
GCCTTCTACC TGTGCGACGT CGTGCTCAAG GTCAACGGCA AGGGCGGCCA 400
GCCGGTGTAC ATCAAGCTGG CCGGTCAGGA CATGCGGCGG TGCGCGCCGA 450
AATGAAATCC TTCGCCGAAC G 471

(2) INFORMATION FOR SEQ ID NO: 64 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 485

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-229/264
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 64 

KGTCTCGCGN CCTTNACATC CGGTCGCCNN RCGGTNATCT GCCTGTGGAT 50
GCCGTCCGGA NGTATNANCN AATGGCCANG AGTNCGTGAC NGCAGNTATG 100
GNCKCGGNTA TAGTTCCGTT TTGCCCNGGA CTNGGNGCGT GAGGTGGAAC 150
TAATGGCGGT GTCGGGTGAT ATTTCCGACG GCAAGNCGAC CATATAGGTG 200
GNATNCGACG GCAATAAACA CACGCTCTGG CCACGTTTCT TGGCGGGGAA 250
AGGGGTGATG CTATCGGAGC CAATGGTATC GCGACAACAC TTGCAGATGC 300
CGCCAAGGCC GATCACGCTA ATGACGGATT CGGGGCCACA AACGTTCCCC 350
GTTCTGGCGG TTTTCTCTGA CTACACCTCA GATCAAGGTG TGATTTTGAT 400
GGATCGCGCC AGTTATCGGG CCCATTGGCA GGATGATGAC GTGACGACCA 450
TGTTTCTTTT TTTGGCNATN CGGGTGCGAA TAGCG 485

(2) INFORMATION FOR SEQ ID NO: 65

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-264A

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 65

GGCGAGGTCA GTGAAGCCGA GGAAGCGGAA AGGAGCGCCC AATACGGAAC 50
CGCCTCTCCC CGCGCGTTGG CCGATTCATT AAATGCAGCT GGCACGACAG 100
GTTTCCCGAC TGGAAMGCGG GCAGTGAGCG CAASGCAATT AATGTGAGTT 150
AGCTCACTCA TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT 200
ATGTTGTGTG GAATTGTGAG CGGATAACAA TTTCACACAG GAAACAGCTA 250
TGACATGATT ACGAATTTAA TACGACTCAC TATAGGGAAT TCGAGCTCGG 300
TACCCGGGGA TCCTCTAGAG TCGCTTCGGT TGGCGGCGAC CAGCAGTGGA 350
TCCACGGTGG CCGCCCGCGC GGCDTCATAC ACCGCCGCGG CCTCCTTGGC 400
CTGTGCGGCC SGCTTAGCGC GCGTGTTGCT GCCGTGCTTA GCCANCTGGC 450
ATAGGGGGCT GCCGCGCGC 469

(2) INFORMATION FOR SEQ ID NO: 66
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 290

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#1-264C
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 66 

CNGGTTCGAC TGATCTAGCT GGGGCCAGAC CGGCACGAGG CGACAGTTAC 50 

CAGTACCTGA CAGACAGGCC GATCGAGCCA AACCGTAGTG AGGACGCAGG 100 

AGGAACAGGC AGATGCATCT AATGATACCC GCGGAGTATA TCTCCAACGT 150

GATATATGAA GGTCCGCGTG CTGACTCATT GTATGCCGCC GAGCAGCGAT 200 

TGCGACAATT AGCTGACTCA GTTAGAACGA CTGCCGAGTC GCTCAACACC 250 

ACGCTCGACG AGCTGCACGA GAACTGGAAA GGTAGTTTCA 290

(2) INFORMATION FOR SEQ ID NO: 67
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1306 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#2-92 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 67

GTGATACAGG AGGCGCCAAC AGTGACACCT CGCGGGCCAG GTCGTTTGCA 50
ACGCTTGTCG CAGTGCAGGC CTCAGCGCGG CTCCGGAGGG CCTGCCCGTG 100
GTCTTCGACA GCTGGCGCTC GCAGCAATGC TGGGGGCATT GGCCGTCACC 150
GTCAGTGGAT GCAGCTGGTC GGAAGCCCTG GGCATCGGTT GGCCGGAGGG 200
CATTACCCCG GAGGCACACC TCAATCGAGA ACTGTGGATC GGGGCGGTGA 250
TCGCCTCCCT GGCGGTTGGG GTAATCGTGT GGGGTCTCAT CTTCTGGTCC 300
GCGGTATTTC ACCGGAAGAA GAACACCGAC ACTGAGTTGC CCCGCCAGTT 350
CGGCTACAAC ATGCCGCTAG AGCTGGTTCT CACCGTCATA CCGTTCCTCA 400
TCATCTCGGT GCTGTTTTAT TTCACCGTCG TGGTGCAGGA GAAGATGCTG 450
CAGATAGCCA AGGATCCCGA GGTCGTGATT GATATCACGT CTTTCCAGTG 500
GAATTGGAAG TTTGGCTATC AAAGGGTGAA CTTCAAAGAC GGCACACTGA 550
CCTATGATGG TGCCGATCCG GAGCGCAAGC GCGCCATGGT TTCCAAGCCA 600
GAGGGCAAGG ACAAGTACGG CGAAGAGCTG GTCGGGCCGG TGCGCGGGCT 650
CAACACCGAG GACCGGACCT ACCTGAATTT CGACAAGGTC GAGACGTTGG 700
GCACCAGCAC CGAAATTCCG GTGCTGGTGC TGCCGTCCGG CAAGCGTATC 750
GAATTCCAAA TGGCCTCAGC CGATGTGATA CACGCATTCT GGGTGCCGGA 800
GTTCTTGTTC AAGCGTGACG TGATGCCTAA CCCGGTGGCA AACAACTCGG 850
TCAACGTCTT CCAGATCGAA GAAATCACCA AGACCGGAGC ATTCGTGGGC 900
CACTGCGCCG AGATGTGTGG CACGTATCAC TCGATGATGA ACTTCGAGGT 950
CCGCGTCGTG ACCCCCAACG ATTTCAAGGC CTACCTGCAG CAACGCATCG 1000
ACGGGAAKAC AAACGCCGAG GCCCTGCGGG CGATCAACCA GCCGCCCCTT 1050
GCGGTGACCA CCCACCCGTT TGATACTCGC CGCGGTGAAT TGGCCCCGCA 1100
GCCCGTAGGT TAGGACGCTC ATGCATATCG AAGCCCGACT GTTTGAGTTT 1150
GTCGCCGCGT TCTTCGTGGT GACGGCGGTG CTGTACGGCG TGTTGACCTC 1200
GATGTTCGCC ACCGGTGGTG TCGAGTGGGC TGGCACCACT GCGCTGGCGC 1250
TTACCGGCGG CATGGCGTTG ATCGTCGCCA CCTTCTTCCG GTTTGTGGCC 1300
GCGGAT 1306

(2) INFORMATION FOR SEQ ID NO: 68

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 759

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-823 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO 68

GGTGCCTGCC ATCGGTTCGC TGNGCCACNG CTGNCNNATC TTTGGTSTGT 50
TAGAGGTNWW CCGCGCGGAT RGCNCANTCC TGTTGGNGGG GGTTRTCGCC 100
ACGATTGCCG CCCGCGCTGA ACCCGACGAC GCCGATGCCC TGCCCACCAC 150
GGATCGGCTG NNMMCANCCG AGCGAACCGT GCAGNATGCN TNTKGTTGAC 200
GAGCCTGCTG GCGCCTTCGC NGGCNCTCGG CGACCATCGG TGCCATCGGA 250
ACCGCCGTNC GCAACCCACG GCATCCACAN GSTCCANGCA TGGCGGTATC 300
GCGNTTGGCC GNCGTCACCG GTGCGCTGCT GCTGCTAYGA GCACGTTCAG 350
CAGACACCAG AAGGTCACTG NTGTTTGCCA TCTGTNGGAA TCACCACCGT 400
TGCAACGGMA NTTGTACCGT CGCCGCGGAT CGGGCTCTGG AACACGGGCC 450
GTGGATTGSC GCGCTGACCG CCATGCTGGT CCNGCCGTGG CAANTGKKTT 500
TGGGCTTCGT NGCTCNCCGC GTTGTCGCTC TCGCCCGTCA CGTACCGCAC 550
CATCGAATTG CTGGAGTGTC TGGCGCTGAT CGCAATGGTT CCATTGACCG 600
CTNTGGSTAT NNNNNCGCCT ANCAGSSSCS TTCGCCACCT CGACCTGACA 650
TGGACATGAC CACNGTCCCG TNACCCTGCG CCTGNCTNGG TGGTMTCAGC 700
GNCNNNTCGY SACGCTGTCT GGSWTGGSRM RCGCNCGGTT GCGCCACGCG 750
GTTTCGCCG 759

(2) INFORMATION FOR SEQ ID NO: 69
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 69

GKTCNCGGTG ATGTCGACNG TCGGCACGRM GNCGAAACCT CANCGGTCGA 50
CAGTGTCTGC CCGAGGCCGC AGCCGACGTG CCCCNGGAGA CCGCGCGCCA 100
ANCACGGTGC CGTACATGTA GCCCGCACGG CGCATCATCG CCGAGCCGGC 150
GTAGATGTTT TCCTGCACGG CGTNCSCGGT GAACCCTCCG GCGCCAGCAC 200
CGSCACCWNT TCCCGCGTCC ACGTCGGCCT GGGTGGTGAC GCCGAGCACC 250
CCACCGAAAT GATCGACATG GCTGTGGGTG TAGATGACCG SCGACCACGG 300
GGCGGTCGGC TCCGCGGTGG GCGCGANTAC AAGTCCAGCG CGGCGGCGGC 350
CACCTCGGTG GACANCCAAN CGGGYNYGAT GACGARWCWG CCCAGTGTCA 400
CCNCWMMACG AAGNCTGATA TTGGAGATAT CGAATCCGCG GACCTGATAG 450
ATGCCCGGCA CCACCTGGTA GAGGCCCTGT TTCGCGGTCA GCTGGGATTG 500
CCGCCACAGG CTGGGATGCA CCGATGTCGG CGCGGCACCG TCGAGNAACG 550
AGTACGCGTC GTTGTCCCAC ACCNACGCGA CCATCGGCAG CCTTGATCAC 600
ACACGGGGAC AGCGCGGCAA TGAATCCGCG ATCGGCGTCG TCGAAATCCG 650
TTGTGTCATN GCAACGGTNA ACGAGTGTTC ACCGTGTGCC GCCTGGNATG 700
ACGGCAGTNG GGAGGTTTGT GTTCCATCGG CACTACATTG CCACTACTAC 750
GGTGCACGCC GGTAGATGCC GTTGGCGAAC CACGCTACCG ACCAGAAAGA 800
GAGAATTTTC CGCCGCACCT AGACCTCGGG CCCTCTAACG CGCATACTGC 850
CGAAGCGGTC CTCAATGCCG ATGGACCGCT ACGACAGGCA AAGGAGCACA 900
GGGTGAAGCG TGGACTGACG GNTCGCGGTA GCCGGAGCCG CCATTCTGGT 950
CGCAGGTCTT TCCGGATGTT CAAGCAACAA GTCGACTACA GGAAGCGGTG 1000
AGACCACGNA CCGCGNGCAG GCACGACNGC AAGCCCCGGC G 1041

(2) INFORMATION FOR SEQ ID NO: 70
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 799

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#1-3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 70 

AGATCNAYAC YANCANCANT GCNGTCATCG AGNTGCTGCA GGNCANGGTG 50
GTCCGTTGGC GAACGTGCTN KGCCNAYACC GGTGCCTTCT CGGCGCNCTN 100
GGYGCAYNGC GACCAGCTGA TCGGCGNAKG TAATCACCAA CCTCAANNKC 150
GGTGCTNGCK ACCKTCGAYK GCAAAGAGYG YGCAATTTGT CGGCCAGTGT 200
CGACCAGCTG CAGCAGCTGG TCAGCGGCCT GGCCAAGAAC CGGGATNCCG 250
ANTSGNGGGC GCCATTTCGC CGCTGGNGTC GACGACGACG GATCTTWCGG 300
AACTGTTGCG GAATTSGCGC CGGCCGCTGC AAGGCAKCCT GGAAAACGCC 350
CGGCCGCTGG CTACCGAGCT GGACAACCGA AAGGCCNANG GTCAASAACG 400
RRATCGAGCA NGCTCGGCGA GGACNATNCC TGCGCCTGTC CGCGCTGGGC 450
AGTTACGGAG CANTTCGTTC AACATCTAST TSTGCTCGGT GACGATSAAG 500
ATCAACGGAC CGGCCGGCAG CGACANTCCN TGCTGCCGAT CGGCGGCCAG 550
CCGGANTCCC AGCAAGGGGA GGTGCGCCTT TGCNTAAATA GGAAGCCAAG 600
TANGCAAASA CGAASGCSAC CCGTCCGCAC CGGNCATCTT CGGCCTGGTG 650
CNTGGTGATC NTGNCGTCGT CCTGATSGNC ATTCGGCTAC AGCGGGTTGC 700
CTKTCTGGCC ACAKKKCAAA ACCTACGACG CGTATTTCAC CGACGCCGGT 750
GGGATCACCC CCGGTAACTC GGTTTATGTS TCGGGCCTCA AGGTGGGCG 799
(2) INFORMATION FOR SEQ ID NO: 71 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 713

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-827 translation strand 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 71 

CTAYCSGCAA NGCTKNGCAG ACGCTCGGCT GCACNGCAGA ANTCGCGGTG 50
CACCCACGAT TGCCAGTAGC GCGGGCCCAC TCGTGCCTAC TACACTTCGT 100
CGTAGCCAAA TCANTCGGCC CCGTAGTATC TCCGGAGATG ACAGATGAAT 150
GTCGTCGACA TTTCNGNCGG TGGCAGTTCG GTATCACCAC CGTSTATCAC 200
TTNCAWYTTC GTNACSYGYT GACCWWCGGC CTGGCNCNCC TKSTKANYRC 250
GGNTCNAYGC AAACTGCTGT GGTCGTCACC GATAANCCCG CCTGGTATCG 300
CCTCACCNAA ATTCTTCGGC AAATTGTTCC TGNATCNAAC NTTTGCCATC 350
GGCGTGGCGA CCGGAATCGT GCAGGNAATK TCAGTTCGGC ATGAACTGGA 400
GCGAGTACTC CCGATTCGTC GGCGATGTCT TCGGCGCCCC GCTGGCCATG 450
GAGNSCTGGC GGCCTTNCTT CTTCGAATCC ACCTTCATCG GGTTGTGGAT 500
CTTCGGCTGG AACAGGCTGC CCCGGCTGGT GCANTCTNGG CCTGCATCTG 550
GNATCGTCGC AATNCGCNGG TNCAACGTGT CCGCGTTCTT CATCATCGCN 600
GGCAAACTCC TTCATGCAGC ATCCGGTCGG CGCGCACTAC AACCCGACCA 650
CCGGGCGTGC CGAGTTGAGC AGCATCGNTC NGTGNCNTGC TGACCAACAA 700
CACCGCACAG GCG 713

(2) INFORMATION FOR SEQ ID NO: 72 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 274
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-834 translation strand 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 72 

CCGCAGCACC GAGGCAAGCA TCGCACCCGT CGATTCCCGC CATCCCGGCG 50 
ACATGATGGT CATGTCCGAC ACCGACGCCC GCACCTCGCT TCCCGAGTTG 100
ACCGCGCTGC GCGTGGACGC CGCAACGGAT GCGTCGGTTC ATTCGATCCC 150
GGCTCGAAAT TGGCCATGGC GAACGCATCT TGCTGTGATG GTTCGGGCAG 200
TAGATCTCCA CTGCCGCACT GATAAACTCG GGTCATGGTC GTCGTGAGGC 250
GGACAGGGTA GAGGCGCATG ACCG 274

(2) INFORMATION FOR SEQ ID NO: 73

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-874
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 73 

GTGATGCCTT CCAGCATTGG ATTGGTCGTC GGTTCGATGC TGTGGCGACA 50 
GATAAACCGC CTGTTCGGGG TGCGTGGCCT CTGCTGGGCA GCGCACTGCT 100
CAACGCCGCT CTGCGCTGCT GTGCATGGTG GCCGAGTCGT GTGGGCAGTG 150
GGTTCACGCC TGGGCGTACT TCACGGCGTT CCTGCTGGCT ACGGTGGCCG 200
CTCAAACGGT GGTCGCCGCA TCGATATCGT GGATCAGCGT CCTCGCGCCC 250
GA 252

(2) INFORMATION FOR SEQ ID NO: 74

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-1018

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 74 

GGCGCCGCCG TCGTGCTGGC CGCCCGGCCC GGTGGGGGTG CCGGCCAGCG 50 
TGGTTCCGCC AGTGGCCGCG CCGAACGTAT TGGCCGGCGT CCTCGAGCAC 100
GACAACGACG GGTCGGGGGC GGCGGTGCTG GCCGCGCTGG CCAAGCTGCC 150
ACCCGGTGGT 160

(2) INFORMATION FOR SEQ ID NO: 75 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: **HinPI#1-27
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 75

ATCAGCCGCG GGTCGACGCC GCCGATGACC TCGACGTCGT CGTCGTCGCT 50
GCCGGTACTC AATCCAATCA CCATCCTCTT ACGCACCTTC TAGGAGTGTG 100 
TTGCTGCGGC AGTGCCGGCC ATTCGTAGAT TCGGGCCTCG CCGTTGTCGT 150 
AGATCTTCGC CCACGACCTC GATGTCTCTA ACGACACTAG TCCGTCCGGC 200 

ACGCAAACCC CGCACCGTCG GAGTGCTGGT CAGGTATAGA CGGTACAGGA 250
GGACTTGGTA GGCCTCGAGT ACCGAGGTAC GTCTCCCGTT GCGGCATAGG 300
CCAGAAGATG AACCGGTGTA GACCGGGCCT GTTGCGAGGG TCGTAGTCGT 350
AGGTCCCAGA GGTGTCGGAC GCCCAGGTTA ATACACAGCG TGC 393

(2) INFORMATION FOR SEQ ID NO: 76

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 248

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: #2-147

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 76

GCAGACCTCT GGCCGCTGGT GGTGCTGGGT ACCTGCGCTG GCGACACCGG 50
ACCGCAGACC GTCAATCGGG ACTCCCGGGA ACGTGGTGCC ATCTTGCCAC 100
GGGGATGGCC GACGCGGCTC GTCATTCTCC CCGAGCGCAC CGGCCGCCGC 150
TGTTGACCGG GCCGCGGCGA CTGATGGTGC CCGCACACGC GGGCGGGTTC 200
AAGGAGCAAT ACGCCAAGTC CAGCGCCGCT CTCGCACGGC GCGGTGTT 248

I claim:

1. An isolated Mycobacterium tuberculosis nucleic acid sequence including a sequence selected from the group
consisting of Seq. I.D. Nos. 1-76.

2. A purified immunostimulatory peptide encoded by a sequence according to claim 1.

3. An antibody that specifically binds to a peptide according to claim 2. 

4. A vaccine preparation comprising at least one immunostimulatory peptide according to claim 2 and a
pharmaceutically acceptable excipient.

5. A purified immunostimulatory peptide encoded by a nucleotide sequence selected from the group consisting 
of Seq. I.D. Nos. 1-76. 

6. A vaccine preparation comprising at least one peptide according to claim 5 and a pharmaceutically acceptable
excipient.

7. A purified immunostimulatory Mycobacterium tuberculosis peptide, the peptide including at least 5 contiguous 
amino acids encoded by a nucleic acid sequence selected from the group consisting of Seq. I.D. Nos. 1 - 76. 

8. A vaccine preparation comprising at least one peptide according to claim 7 and a pharmaceutically
acceptable excipient.

9. A peptide according to claim 7 wherein the peptide includes at least 10 contiguous amino acids encoded by a
nucleic acid sequence selected from the group consisting of Seq. I.D. Nos. 1 - 76.

10. A vaccine preparation comprising at least one peptide according to claim 9 and a pharmaceutically 
acceptable excipient. 

11. A method of making a vaccine comprising:

providing at least one purified peptide encoded by a nucleotide sequence selected from the group consisting of
Seq. I.D. Nos. 1 - 76; and

combining the peptide with a pharmaceutically acceptable excipient.

12. An isolated nucleic acid molecule having a nucleotide sequence selected from the group consisting of:

(a) Seq. ID Nos. 1 - 76;

(b) nucleotide sequences complementary to a sequence defined in (a); and 

(c) nucleic acid molecules of at least 15 nucleotides in length which hybridize under conditions of at least 15% 
stringency to a sequence defined in (a) or (b). 

13. A recombinant DNA vector including a nucleic acid molecule according to claim 12.

14. A transformed cell containing a vector according to claim 13.

15. A nucleic acid probe comprising a nucleic acid molecule according to claim 12 and a diagnostic label.

16. A method of isolating a Mycobacterium tuberculosis gene which gene encodes an immunostimulatory 
peptide, the method comprising the steps of: 

providing nucleic acids of Mycobacterium tuberculosis; 

contacting said nucleic acids with a probe or primer, the probe or primer comprising at least 15 
contiguous nucleotides of a polynucleotide having a nucleotide sequence selected from the group consisting of Seq. ID
Nos. 1 - 76 and sequences complementary thereto; and 

isolating the Mycobacterium tuberculosis gene. 

17. An isolated Mycobacterium tuberculosis gene produced by the method of claim 16. 

18. An isolated Mycobacterium tuberculosis nucleic acid molecule, said molecule encoding an
immunostimulatory peptide and hybridizing under conditions of at least 75% stringency to a nucleic acid probe
comprising at least 20 contiguous bases of a sequence selected from Seq. ID Nos. 1 - 76.

19. A purified immunostimulatory peptide encoded by the nucleic acid molecule of claim 18. 

20. An immunostimulatory preparation comprising: 
a purified peptide according to claim 19; and

a pharmaceutically acceptable excipient.

21. An improved tuberculin skin test, the improvement comprising the use of one or more immunostimulatory
peptides according to claim 19. 

22. A vaccine preparation comprising an immunostimulatory membrane peptide isolated from Mycobacterium 
tuberculosis and a suitable excipient. 

23. A method of detecting the presence of Mycobacterium tuberculosis DNA in a sample comprising contacting 
the sample with a nucleic acid probe according to claim 15 and detecting hybridization products that include the nucleic 
acid probe. 

24. A method of detecting the presence of Mycobacterium tuberculosis DNA in a sample comprising: 
selecting two or more nucleic acid primer molecules from the nucleic acid molecules defined in claim 12, said
molecules suitable for amplification of a Mycobacterium tuberculosis target sequence;

incubating the sample under conditions suitable to amplify the target sequence; and

detecting an amplified product.

25. A method of detecting the presence of a Mycobacterium tuberculosis peptide in a sample comprising 
contacting the sample with an antibody according to claim 3 and detecting the presence of an antibody-peptide complex. 

26. A method of detecting the presence of an anti-Mycobacterium tuberculosis antibody in a sample comprising
contacting the sample with a peptide according to claim 2 and detecting the presence of an antibody-peptide complex. 
GTGATACAGGAGGCGCCAACAGTGACACCTCGCGGGCCAGGTCGTTTGCAACGCTTGTCGCAGTGCAGGC 70
CACTATGTCCTCCGCGGTTGTCACTGTGGAGCGCCCGGTCCAGCAAACGTTGCGAACAGCGTCACGTCCG
MTPRGPGRLQRLSQCR

CTCAGCGCGGCTCCGGAGGGCCTGCCCGTGGTCTTCGACAGCTGGCGCTCGCAGCAATGCTGGGGGCATT 140
GAGTCGCGCCGAGGCCTCCCGGACGGGCACCAGAAGCTGTCGACCGCGAGCGTCGTTACGACCCCCGTAA
PQRGSGGPARGLRQLALAAMLGAL

GGCCGTCACCGTCAGTGGATGCAGCTGGTCGGAAGCCCTGGGCATCGGTTGGCCGGAGGGCATTACCCCG 210
CCGGCAGTGGCAGTCACCTACGTCGACCAGCCTTCGGGACCCGTAGCCAACCGGCCTCCCGTAATGGGGC
AVTVSGCSWSEALGIGWPEGITP

GAGGCACACCTCAATCGAGAACTGTGGATCGGGGCGGTGATCGCCTCCCTGGCGGTTGGGGTAATCGTGT 280
CTCCGTGTGGAGTTAGCTCTTGACACCTAGCCCCGCCACTAGCGGAGGGACCGCCAACCCCATTAGCACA
EAHLNRELWIGAVIASLAVGVIV

GGGGTCTCATCTTCTGGTCCGCGGTATTTCACCGGAAGAAGAACACCGACACTGAGTTGCCCCGCCAGTT 350
CCCCAGAGTAGAAGACCAGGCGCCATAAAGTGGCCTTCTTCTTGTGGCTGTGACTCAACGGGGCGGTCAA
WGLIFWSAVFHRKKNTDTELPRQF

CGGCTACAACATGCCGCTAGAGCTGGTTCTCACCGTCATACCGTTCCTCATCATCTCGGTGCTGTTTTAT 420
GCCGATGTTGTACGGCGATCTCGACCAAGAGTGGCAGTATGGCAAGGAGTAGTAGAGCCACGACAAAATA
GYNMPLELVLTVIPFLIISVLFY

TTCACCGTCGTGGTGCAGGAGAAGATGCTGCAGATAGCCAAGGATCCCGAGGTCGTGATTGATATCACGT 490
AAGTGGCAGCACCACGTCCTCTTCTACGACGTCTATCGGTTCCTAGGGCTCCAGCACTAACTATAGTGCA
FTVVVQEKMLQIAKDPEVVIDIT

CTTTCCAGTGGAATTGGAAGTTTGGCTATCAAAGGGTGAACTTCAAAGACGGCACACTGACCTATGATGG 560
GAAAGGTCACCTTAACCTTCAAACCGATAGTTTCCCACTTGAAGTTTCTGCCGTGTGACTGGATACTACC
SFQWNWKFGYQRVNFKDGTLTYDG

TGCCGATCCGGAGCGCAAGCGCGCCATGGTTTCCAAGCCAGAGGGCAAGGACAAGTACGGCGAAGAGCTG 630
ACGGCTAGGCCTCGCGTTCGCGCGGTACCAAAGGTTCGGTCTCCCGTTCCTGTTCATGCCGCTTCTCGAC
ADPERKRAMVSKPEGKDKYGEEL

GTCGGGCCGGTGCGCGGGCTCAACACCGAGGACCGGACCTACCTGAATTTCGACAAGGTCGAGACGTTGG 700
CAGCCCGGCCACGCGCCCGAGTTGTGGCTCCTGGCCTGGATGGACTTAAAGCTGTTCCAGCTCTGCAACC
VGPVRGLNTEDRTYLNFDKVETL

GCACCAGCACCGAAATTCCGGTGCTGGTGCTGCCGTCCGGCAAGCGTATCGAATTCCAAATGGCCTCAGC 770
CGTGGTCGTGGCTTTAAGGCCACGACCACGACGGCAGGCCGTTCGCATAGCTTAAGGTTTACCGGAGTCG
GTSTEIPVLVLPSGKRIEFQMASA

CGATGTGATACACGCATTCTGGGTGCCGGAGTTCTTGTTCAAGCGTGACGTGATGCCTAACCCGGTGGCA 840
GCTACACTATGTGCGTAAGACCCACGGCCTCAAGAACAAGTTCGCACTGCACTACGGATTGGGCCACCGT
DVIHAFWVPEFLFKRDVMPNPVA

AACAACTCGGTCAACGTCTTCCAGATCGAAGAAATCACCAAGACCGGAGCATTCGTGGGCCACTGCGCCG 910
TTGTTGAGCCAGTTGCAGAAGGTCTAGCTTCTTTAGTGGTTCTGGCCTCGTAAGCACCCGGTGACGCGGC
NNSVNVFQIEEITKTGAFVGHCA

AGATGTGTGGCACGTATCACTCGATGATGAACTTCGAGGTCCGCGTCGTGACCCCCAACGATTTCAAGGC 980
TCTACACACCGTGCATAGTGAGCTACTACTTGAAGCTCCAGGCGCAGCACTGGGGGTTGCTAAAGTTCCG
EMCGTYHSMMNFEVRVVTPNDFKA

CTACCTGCAGCAACGCATCGACGGGAATACAAACGCCGAGGCCCTGCGGGCGATCAACCAGCCGCCCCTT 1050
GATGGACGTCGTTGCGTAGCTGCCCTTATGTTTGCGGCTCCGGGACGCCCGCTAGTTGGTCGGCGGGGAA
YLQQRIDGNTNAEALRAINQPPL

GCGGTGACCACCCACCCGTTTGATACTCGCCGCGGTGAATTGGCCCCGCAGCCCGTAGGTTAGGACGCTC 1120
CGCCACTGGTGGGTGGGCAAACTATGAGCGGCGCCACTTAACCGGGGCGTCGGGCATCCAATCCTGCGAG
AVTTHPFDTRRGELAPQPVG.
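
The three-line layout of the sequence above (upper strand, complementary strand, deduced amino acids) can be checked with a short script. The following Python sketch is illustrative only and is not part of the original disclosure; the names `complement` and `translate` are hypothetical, and the standard genetic code is assumed. Note that a plain codon table reads the initiating GTG codon as valine, whereas a GTG start codon is translated as methionine in the cell.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def complement(seq):
    """Return the base-by-base complement (same left-to-right order)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))


def translate(seq, start=0):
    """Translate complete codons of seq, beginning at 0-based index start."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(start, len(seq) - 2, 3))


# First 70 nucleotides of the sequence shown above.
first_line = ("GTGATACAGGAGGCGCCAACAGTGACACCTCGCGGGCCAGG"
              "TCGTTTGCAACGCTTGTCGCAGTGCAGGC")
lower_strand = complement(first_line)
# The open reading frame begins at nucleotide 22 (0-based index 21).
peptide = translate(first_line, start=21)
```

Here `peptide` begins with V rather than M because the table lookup does not apply the start-codon rule.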
Historically it has been thought that one needs replicating Mycobacteria in order to effect a protective
immunization. An hypothesis explaining the molecular basis for the effectiveness of replicating mycobacteria in
inducing protective immunity has been proposed by Orme and co-workers (3). These scientists suggest that
antigens are pinocytosed from the mycobacteria-laden phagosome and used in antigen presentation. This
hypothesis also explains the basis for secreted proteins effecting a protective immune response.

Antigens that stimulate T cells from tuberculosis infected mice or from PPD-positive humans are found 
in both the whole mycobacterial cells and also in the culture supernatants (3, 4, 5-7, 34). Recently Pal and
Horwitz (8) were able to induce partial protection in guinea pigs by vaccinating with M. tuberculosis supernatant 
fluids. Similar results were found by Andersen using a murine model of tuberculosis (9). Other studies include 
reference nos. 34, 12. Although these works are far from definitive they do strengthen the notion that protective 
epitopes can be found among secreted proteins and that a non-living vaccine can protect against tuberculosis. 

For the purposes of vaccine development one needs to find epitopes that confer protection but do not 
contribute to pathology. An ideal vaccine would contain a cocktail of T-cell epitopes that preferentially stimulate
Th1 cells and are bound by different MHC haplotypes. Although such vaccines have never been made there is at
least one example of a synthetic T-cell epitope inducing protection against an intracellular pathogen (10). It is an
object of this invention to provide M. tuberculosis DNA sequences that encode bacterial peptides having an 
immunostimulatory activity. Such immunostimulatory peptides will be useful in the treatment, diagnosis and 
prevention of tuberculosis. 

II. SUMMARY OF THE INVENTION

The present invention provides DNA sequences isolated from Mycobacterium tuberculosis. Peptides 
encoded by these DNA sequences are shown to stimulate the production of the macrophage-stimulating cytokine,
gamma interferon ("INF-γ"), in mice. Critically, the production of INF-γ by CD4 cells in mice has been shown to
correlate with maximum expression of protective immunity against tuberculosis (11). Furthermore, in human
patients with active "minimal" or "contained" tuberculosis, it appears that the containment of the disease may be
attributable, at least in part, to the production of CD4 Th-1-like lymphocytes that release INF-γ (12).

Hence, the DNA sequences provided by this invention encode peptides that are capable of stimulating 
T-cells to produce INF-γ. That is, these peptides act as epitopes for CD4 T-cells in the immune system. Studies
have demonstrated that peptides isolated from an infectious agent and which are shown to be T-cell epitopes can
protect against the disease caused by that agent when administered as a vaccine (13, 10). For example, T-cell
epitopes from the parasite Leishmania major have been shown to be effective when administered as a vaccine (10, 
13-14). Therefore, the immunostimulatory peptides (T-cell epitopes) encoded by the disclosed DNA sequences 
may be used, in purified form, as a vaccine against tuberculosis. 

As noted, the nucleotide sequences of the present invention encode immunostimulatory peptides. In a 
number of instances, these nucleotide sequences are only a part of a larger open reading frame (ORF) of an
M. tuberculosis operon. The present invention enables the cloning of the complete ORF using standard molecular
biology techniques, based on the nucleotide sequences provided herein. Thus, the present invention encompasses
both the nucleotide sequences disclosed herein and the complete M. tuberculosis ORFs to which they correspond.
However, it is noted that since each of the nucleotide sequences disclosed herein encodes an immunostimulatory 
peptide, the use of larger peptides encoded by the complete ORFs is not necessary for the practice of the invention. 
Indeed, it is anticipated that, in some instances, proteins encoded by the corresponding ORFs may be less 
immunostimulatory than the peptides encoded by the nucleotide sequences provided herein. 

One aspect of the present invention is an immunostimulatory preparation comprising at least one peptide 
encoded by the DNA sequences presented herein. Such a preparation may include the purified peptide or peptides 
and one or more pharmaceutically acceptable adjuvants, diluents and/or excipients. Another aspect of the
MYCOBACTERIUM TUBERCULOSIS DNA SEQUENCES ENCODING
IMMUNOSTIMULATORY PEPTIDES

CROSS REFERENCE TO RELATED CASES
This application claims the benefit of U.S. Provisional Application No. 60/000,254, filed June 15, 1995,
which is incorporated herein by reference.
I. BACKGROUND 

A. THE RISE OF TUBERCULOSIS 

Over the past few years the editors of the Morbidity and Mortality Weekly Report have chronicled the
unexpected rise in tuberculosis cases. It has been estimated that worldwide there are one billion people infected
with M. tuberculosis, with 7.5 million active cases of tuberculosis. Even in the United States, tuberculosis
continues to be a major problem, especially among the homeless, Native Americans, African-Americans,
immigrants, and the elderly. HIV-infected individuals represent the newest group to be affected by tuberculosis.
Of the 88 million new cases of tuberculosis expected in this decade approximately 10% will be attributable to HIV
infection.

The emergence of multi-drug resistant strains of M. tuberculosis has complicated matters further and even 
raises the possibility of a new tuberculosis epidemic. In the U.S. about 14% of M. tuberculosis isolates are 
resistant to at least one drug, and approximately 3% are resistant to at least two drugs. M. tuberculosis strains 
have even been isolated that are resistant to all seven drugs in the repertoire of drugs commonly used to combat
tuberculosis. Resistant strains make treatment of tuberculosis extremely difficult: for example, infection with M.
tuberculosis strains resistant to isoniazid and rifampin leads to mortality rates of approximately 90% among HIV-
infected individuals. The mean time to death after diagnosis in this population is 4-16 weeks. One study reported
that of nine immunocompetent health care workers and prison guards infected with drug resistant M. tuberculosis,
five died. The expected mortality rate for infection with drug sensitive M. tuberculosis is 0%.

The unrelenting persistence of mycobacterial disease worldwide, the emergence of a new, highly
susceptible population, and the recent appearance of drug resistant strains point to the need for new and better
prophylactic and therapeutic treatments of mycobacterial diseases.

B. TUBERCULOSIS AND THE IMMUNE SYSTEM 

Infection with M. tuberculosis can take on many manifestations. The growth in the body of M.
tuberculosis and the pathology that it induces is largely dependent on the type and vigor of the immune response.
From mouse genetic studies it is known that innate properties of the macrophage play a large role in containing
disease (1). Initial control of M. tuberculosis may also be influenced by reactive γδ T cells. However, the major
immune response responsible for containment of M. tuberculosis is via helper T cells (Th1) and to a lesser extent
cytotoxic T cells (2). Evidence suggests that there is very little role for the humoral response. The ratio of
responding Th1 to Th2 cells has been proposed to be involved in the phenomenon of suppression.

Th1 cells are thought to convey protection by responding to M. tuberculosis T cell epitopes and secreting
cytokines, particularly interferon-γ, which stimulate macrophages to kill M. tuberculosis. While such an immune
response normally clears infections by many facultative intracellular pathogens, such as Salmonella, Listeria or
Francisella, it is only able to contain the growth of other pathogens such as M. tuberculosis and Toxoplasma.
Hence, it is likely that M. tuberculosis has the ability to suppress a clearing immune response, and mycobacterial
components such as lipoarabinomannan are thought to be potential agents of this suppression. Dormant M.
tuberculosis can remain in the body for long periods of time and can emerge to cause disease when the immune
system wanes due to age or other effects such as infection with HIV-1.
"Probes" and "primers". Nucleic acid probes and primers may readily be prepared based on the nucleic
acid sequences provided by this invention. A "probe" comprises an isolated nucleic acid attached to a detectable
label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and
enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are 
discussed, e.g., in reference nos. 15 and 16. 

"Primers" are short nucleic acids, preferably DNA oligonucleotides 15 nucleotides or more in length,
which are annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between
the primer and the target DNA strand, then extended along the target DNA strand by a DNA polymerase enzyme.
Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction
(PCR) or other nucleic-acid amplification methods known in the art.

As noted, probes and primers are preferably 15 nucleotides or more in length, but, to enhance specificity,
probes and primers of 20 or more nucleotides may be preferred.

Methods for preparing and using probes and primers are described, for example, in reference nos. 15, 16
and 17. PCR primer pairs can be derived from a known sequence, for example, by using computer programs
intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research,
Cambridge, MA).
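
As a rough illustration of how a primer pair relates to a known sequence, the following Python sketch takes the first bases of the target as the forward primer and the reverse complement of its last bases as the reverse primer. It is a simplified assumption, not the algorithm of the Primer program cited above; the function names are hypothetical, and real primer design would also weigh melting temperature, GC content and self-complementarity.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def naive_primer_pair(target, length=20):
    """Pick end-anchored primers of the given length from a known target.

    The text prefers primers of 15 nucleotides or more, so shorter
    lengths are rejected here.
    """
    if length < 15:
        raise ValueError("primers are preferably 15 nucleotides or more")
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target too short for two non-overlapping primers")
    forward = target[:length]                       # anneals to the lower strand
    reverse = reverse_complement(target[-length:])  # anneals to the upper strand
    return forward, reverse
```

Both primers are written 5' to 3', which is why the reverse primer is the reverse complement of the target's 3' end.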

"Substantial similarity". A first nucleic acid is "substantially similar" to a second nucleic acid if, when
optimally aligned (with appropriate nucleotide insertions or deletions) with the other nucleic acid (or its
complementary strand), there is nucleotide sequence identity in at least about 75%-90% of the nucleotide bases,
and preferably greater than 90% of the nucleotide bases. ("Substantial sequence complementarity" requires a
similar degree of sequence complementarity.) Sequence similarity can be determined by comparing the nucleotide
sequences of two nucleic acids using sequence analysis software such as the Sequence Analysis Software Package
of the Genetics Computer Group (University of Wisconsin Biotechnology Center, Madison, WI).
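
The identity figure used in this definition can be illustrated with a small calculation on two sequences that have already been aligned. The Python sketch below is an assumption-laden example rather than the method of the cited software package: it expects equal-length aligned strings with "-" for gaps, counts gap columns as mismatches, and divides by the alignment length. The function names and the 75% default are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def substantially_similar(aligned_a, aligned_b, threshold=75.0):
    """Apply the lower bound of the 'at least about 75%-90%' range by default."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions for the denominator (for example, the length of the shorter sequence) would give slightly different percentages.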

"Operably linked". A first nucleic acid sequence is "operably linked" with a second nucleic acid sequence
when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For
instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression
of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join
two protein coding regions, in the same reading frame.

"Recombinant". A "recombinant" nucleic acid is one that has a sequence that is not naturally occurring or
has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This
artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial
manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.

"Stringent Conditions" and "Specific". The nucleic acid probes and primers of the present invention
hybridize under stringent conditions to a target DNA sequence, e.g., to a full length Mycobacterium tuberculosis
gene that encodes an immunostimulatory peptide.

The term "stringent conditions" is functionally defined with regard to the hybridization of a nucleic-acid
probe to a target nucleic acid (i.e., to a particular nucleic acid sequence of interest) by the hybridization procedure
discussed in Sambrook et al. (1989) (reference no. 15) at 9.52-9.55. See also reference no. 15 at 9.47-9.52,
9.56-9.58; reference no. 18 and reference no. 19.

Nucleic-acid hybridization is affected by such conditions as salt concentration, temperature, or organic
solvents, in addition to the base composition, length of the complementary strands, and the number of nucleotide-
base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art.

In preferred embodiments of the present invention, stringent conditions are those under which DNA 
molecules with more than 25% sequence variation (also termed "mismatch") will not hybridize. Such conditions 
invention is a vaccine comprising one or more peptides encoded by nucleotide sequences provided herein. This
vaccine may also include one or more pharmaceutically acceptable excipients, adjuvants and/or diluents.

Another aspect of the present invention is an antibody specific for an immunostimulatory peptide encoded
by a nucleotide sequence of the present invention. Such antibodies may be used to detect the presence of M.
tuberculosis antigens in medical specimens, such as blood or sputum. Thus, these antigens may be used to
diagnose tuberculosis infections.

The present invention also encompasses the diagnostic use of purified peptides encoded by the nucleotide 
sequences of the present invention. Thus, the peptides may be used in a diagnostic assay to detect the presence of 
antibodies in a medical specimen, which antibodies bind to the M. tuberculosis peptide and indicate that the subject
from which the specimen was removed was previously exposed to M. tuberculosis.

The present invention also provides an improved method of performing the tuberculin skin test to diagnose
exposure of an individual to M. tuberculosis. In this improved skin test, purified immunostimulatory peptides
encoded by the nucleotide sequences of this invention are employed. Preferably, this skin test is performed with
one set of the immunostimulatory peptides, while another set of the immunostimulatory peptides is used to
formulate vaccine preparations. In this way, the tuberculin skin test will be useful in distinguishing between
subjects infected with tuberculosis and subjects who have simply been vaccinated. In this manner, the present
invention may overcome a serious limitation inherent in the present BCG vaccine/tuberculin skin test combination.

Other aspects of the present invention include the use of probes and primers derived from the nucleotide 
sequences disclosed herein to detect the presence of M. tuberculosis nucleic acids in medical specimens. 

A further aspect of the present invention is the discovery that a significant proportion of the
immunostimulatory peptides are homologous to proteins known to be located in bacterial cell surface membranes.
This discovery suggests that membrane-bound peptides, particularly those from M. tuberculosis, may be a new
source of antigens for use in vaccine preparations.

III. BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows the deduced amino acid sequence of the full length MTB2-92 protein.

Fig. 2 shows an SDS polyacrylamide gel (12%) representing the different stages of the purification of
MTB2-92. Lane 1: Molecular weight markers (high range, GIBCO-BRL, Grand Island, NY, U.S.A.); Lane 2:
the IPTG induced crude bacterial lysate of E. coli JM109 containing pMAL-MTB2-92; Lane 3: Uninduced crude
bacterial lysate of E. coli JM109 containing pMAL-MTB2-92; Lane 4: Eluate from the amylose-resin column
containing the MBP-MTB2-92 fusion protein; Lane 5: Eluate shown in previous lane after cutting with protease
Factor Xa; Lane 6: Eluate from the Ni-NTA column, containing MTB2-92.

IV. DESCRIPTION OF THE INVENTION 

A. DEFINITIONS 

Particular terms and phrases used herein have the meanings set forth below. 

"Isolated". An "isolated" nucleic acid has been substantially separated or purified away from other nucleic
acid sequences in the cell of the organism in which the nucleic acid naturally occurs, i.e., other chromosomal and
extrachromosomal DNA and RNA. The term "isolated" thus encompasses nucleic acids purified by standard
nucleic acid purification methods. The term also embraces nucleic acids prepared by recombinant expression in a
host cell as well as chemically synthesized nucleic acids.

The nucleic acids of the present invention comprise at least a minimum length able to hybridize specifically
with a target nucleic acid (or a sequence complementary thereto) under stringent conditions as defined below. The
length of a nucleic acid of the present invention is preferably 15 nucleotides or greater, although a shorter
nucleic acid may be employed as a probe or primer if it is shown to specifically hybridize under stringent
conditions with a target nucleic acid by methods well known in the art.
plating clones on agar plates containing the indicator 5-bromo-4-chloro-3-indolyl-phosphate. Alkaline phosphatase
converts this indicator to a blue colored product. Hence, those clones containing secreted alkaline phosphatase 
fusion proteins will produce the blue color. 

The three vectors in this series (pJDT1, 2 and 3) have the BstBI restriction sites located in different reading
frames with respect to the phoA gene. This increases the likelihood of cloning any particular gene in the correct 
orientation and reading frame for expression by a factor of 3. Reference no. 31 describes pJDT vectors in detail. 
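
The factor-of-3 argument can be made concrete with a toy model. In the Python sketch below, each vector is assumed to contribute 0, 1 or 2 extra bases between the cloning site and the first phoA codon; these offsets and their assignment to pJDT1, pJDT2 and pJDT3 are hypothetical and are not taken from reference no. 31. Under that assumption, a correctly oriented insert is in frame with phoA in exactly one of the three vectors.

```python
# Hypothetical frame offsets; the real junction sequences are described in reference no. 31.
VECTOR_FRAME_OFFSETS = {"pJDT1": 0, "pJDT2": 1, "pJDT3": 2}


def in_frame_vectors(coding_bases_before_junction):
    """List the vectors in which an insert's reading frame continues into phoA.

    coding_bases_before_junction is the number of bases from the first base of
    the insert's start codon to the end of the cloned fragment.
    """
    return [name for name, offset in VECTOR_FRAME_OFFSETS.items()
            if (coding_bases_before_junction + offset) % 3 == 0]
```

Because the three offsets cover every remainder modulo 3, the list always contains exactly one vector, whatever the fragment length.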

3. SELECTION OF SECRETED FUSION PROTEINS 
The recombinant clones described above were transformed into E. coli and plated on agar plates containing
the indicator 5-bromo-4-chloro-3-indolyl-phosphate. Production of blue pigmentation, produced as a result of the
action of alkaline phosphatase on the indicator, indicated the presence of secreted, cytoplasmic membrane,
periplasmic, cell wall associated or outer membrane fusion proteins (because the bacterial alkaline phosphatase
gene in the vector lacks a signal sequence and could not otherwise escape the bacterial cell). A similar technique
has been used to identify M. tuberculosis genes encoding exported proteins by Lim et al. (32).

Those clones producing blue pigmentation were picked and grown in liquid culture to facilitate the 
purification of the alkaline phosphatase fusion proteins. These recombinant clones were designated according to
the restriction enzyme used to digest the Mycobacterium tuberculosis DNA (thus, clones designated A£2-l. A#2-2 
etc. were produced using Mycobacterium tuberculosis DNA digested with AciI).

4. PURIFICATION OF SECRETED FUSION PROTEINS 
PhoA fusion proteins were extracted from the selected E. coli clones by cell lysis and purified by SDS
polyacrylamide gel electrophoresis. Essentially, individual E. coli clones are grown overnight at 30°C with shaking
in 2 ml LB broth containing ampicillin, kanamycin and IPTG. The cells are precipitated by centrifugation and
resuspended in 100 µL Tris-EDTA buffer. 100 µL lysis buffer (1% SDS, 1 mM EDTA, 25 mM DTT, 10%
glycerol and 50 mM Tris-HCl, pH 7.5) is added to this mixture and DNA released from the cells is sheared by
passing the mixture through a small gauge syringe needle. The sample is then heated for 5 minutes at 100°C and
loaded onto an SDS PAGE gel (12 cm x 14 cm x 1.5 mm, made with 4% (w/v) acrylamide in the stacking section
and 10% (w/v) acrylamide in the separating section). Several samples from each clone are loaded onto each gel.

The samples are electrophoresed by application of 200 volts to the gel for 4 hours. Subsequently, the 
proteins are transferred to a nitrocellulose membrane by Western blotting. A strip of nitrocellulose is cut off to be 
processed with antibody, and the remainder of the nitrocellulose is set aside for eventual elution of the protein. 
The strip is incubated with blocking buffer and then with anti-alkaline phosphatase primary antibody, followed by
incubation with anti-mouse antibody conjugated with horseradish peroxidase. Finally, the strip is developed with
the NEN DuPont Renaissance kit to generate a luminescent signal. The migratory position of the PhoA fusion 
protein, as indicated by the luminescent label, is measured with a ruler, and the corresponding region of the 
undeveloped nitrocellulose blot is excised. 
This region of nitrocellulose, which contains the PhoA fusion protein, is then incubated in 1 ml 20%
acetonitrile at 37°C for 3 hours. Subsequently, the mixture is centrifuged to remove the nitrocellulose and the
liquid is transferred to a new test tube and lyophilized. The resulting protein pellet is dissolved in 100 µL of
endotoxin-free, sterile water and precipitated with acetone at -20°C. After centrifugation the bulk of the acetone is
removed and the residual acetone is allowed to evaporate. The protein pellet is re-dissolved in 100 µL of sterile
phosphate buffered saline. This procedure can be scaled up by modification to include IPTG induction 2 hours
prior to cell harvesting, washing nitrocellulose membranes with PBS prior to acetonitrile extraction and
lyophilization of acetonitrile extracted and acetone precipitated protein samples.
are also referred to as conditions of 75% stringency (since hybridization will occur only between molecules with
75% sequence identity or greater). In more preferred embodiments, stringent conditions are those under which
DNA molecules with more than 15% mismatch will not hybridize (i.e., conditions of 85% stringency). In most
preferred embodiments, stringent conditions are those under which DNA molecules with more than 10% mismatch
will not hybridize (i.e., conditions of 90% stringency).
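
The relationship between percent stringency and tolerated mismatch stated here can be written as a simple threshold check. The Python sketch below is illustrative only; the dictionary labels and function names are not from the original text, and it treats hybridization as a pure function of percent identity, ignoring salt, temperature and length effects.

```python
# Stringency levels named in the text, as minimum percent sequence identity.
STRINGENCY_LEVELS = {"preferred": 75.0, "more preferred": 85.0, "most preferred": 90.0}


def hybridizes(percent_identity, stringency):
    """True if a duplex with this identity is expected to form at the given stringency."""
    return percent_identity >= stringency


def levels_passed(percent_identity):
    """Names of the stringency levels at which hybridization would still occur."""
    return [name for name, level in STRINGENCY_LEVELS.items()
            if hybridizes(percent_identity, level)]
```

For example, two molecules with 88% identity (12% mismatch) would hybridize at 75% and 85% stringency but not at 90%.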

When referring to a probe or primer, the term "specific for (a target sequence)" indicates that the probe or 
primer hybridizes under stringent conditions substantially only to the target sequence in a given sample comprising 
the target sequence. 

"Purified" - a "purified" peptide is a peptide that has been extracted from the cellular environment and 
separated from substantially all other cellular peptides. As used herein, the term peptide includes peptides,
polypeptides and proteins. In preferred embodiments, a "purified" peptide is a preparation in which the subject
peptide comprises 80% or more of the protein content of the preparation. For certain uses, such as vaccine 
preparations, even greater purity may be necessary. 

"Immunostimulatory" - the phrase "immunostimulatory peptide" as used herein refers to a peptide that is 
capable of stimulating INF-γ production in the assay described in section B.5 below. In preferred embodiments,
an immunostimulatory peptide is one capable of inducing greater than twice the background level of this assay
determined using T-cells stimulated with no antigens or negative control antigens. Preferably, the
immunostimulatory peptides are capable of inducing more than 0.01 ng/ml of INF-γ in this assay system. In
more preferred embodiments, an immunostimulatory peptide is one capable of inducing greater than 10 ng/ml of
INF-γ in this assay system.
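
The definition above amounts to comparing a measured cytokine level against thresholds. The Python sketch below is a hedged illustration covering only two of the stated criteria, the twice-background test and the 10 ng/ml level; the function name, the return labels and the omission of the intermediate threshold are choices made for this example, not part of the original disclosure.

```python
def classify_response(ifn_ng_ml, background_ng_ml, more_preferred_ng_ml=10.0):
    """Classify a measured gamma-interferon level (ng/ml) against the definition.

    background_ng_ml is the level obtained with no antigen or with a
    negative control antigen in the same assay.
    """
    if ifn_ng_ml <= 2.0 * background_ng_ml:
        return "not immunostimulatory"
    if ifn_ng_ml > more_preferred_ng_ml:
        return "more preferred"
    return "immunostimulatory"
```

A peptide inducing 0.5 ng/ml against a background of 0.1 ng/ml would be classed as immunostimulatory, while one inducing 12 ng/ml would fall in the more preferred class.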

B. MATERIALS AND METHODS

1. STANDARD METHODOLOGIES 
The present invention utilizes standard laboratory practices for the cloning, manipulation and sequencing of 
nucleic acids, purification and analysis of proteins and other molecular biological and biochemical techniques,
unless otherwise stipulated. Such techniques are explained in detail in standard laboratory manuals such as
Sambrook et al. (15) and Ausubel et al. (16).

Methods for chemical synthesis of nucleic acids are discussed, for example, in reference nos. 20 and 21. 
Chemical synthesis of nucleic acids can be performed, for example, on commercial automated oligonucleotide 
synthesizers. 

2. ISOLATION OF MYCOBACTERIUM TUBERCULOSIS DNA SEQUENCES
ENCODING IMMUNOSTIMULATORY PROTEINS

Mycobacterium tuberculosis DNA was obtained by the method of Jacobs et al. (22). Samples of
the isolated DNA were partially digested with one of the following restriction enzymes: HinPI, HpaII, AciI, TaqI,
BsaHI, NarI. Digested fragments of 0.2-5 kb were purified from agarose gels and then ligated into the BstBI site
in front of the truncated phoA gene in one or more of the three phagemid vectors pJDT1, pJDT2, and pJDT3.

A schematic representation of the phagemid vector pJDT2 is provided in Mdluli et al. (1995) (reference
no. 31). The pJDT vectors were specifically designed for cloning and selecting genes encoding cell wall-
associated, cytoplasmic membrane associated, periplasmic or secreted proteins (and especially for cloning such
genes from GC rich genomes, such as the Mycobacterium tuberculosis genome). The vectors have a BstBI cloning
site in frame with the bacterial alkaline phosphatase gene (phoA) such that cloning of an in-frame sequence into the
cloning site will result in the production of a fusion protein. The phoA gene encodes a version of the alkaline
phosphatase that lacks a signal sequence; hence, only if the DNA cloned into the BstBI site includes a signal
sequence or a transmembrane sequence can the fusion protein be secreted to the medium or inserted into
cytoplasmic membrane, periplasm or cell wall. Those clones encoding such fusion proteins may be detected by
Cytokines are measured using an Enzyme Linked Immunosorbent Assay (ELISA), the details of which are
described in the Cytokine ELISA Protocol in the PharMingen catalogue (PharMingen, San Diego, California). For
measuring for the presence of human gamma-interferon, wells of a 96 well microtitre plate are coated with a
capture antibody (anti-human gamma-interferon antibody). The sample supernatants are then added to individual
wells. Any gamma-interferon present in the sample will bind to the capture antibody. The wells are then washed. 
A detection antibody (anti-human gamma-interferon antibody), conjugated to biotin, is added to each well, and will 
bind to any gamma-interferon that is bound to the capture antibody. Any unbound detection antibody is washed 
away. An avidin peroxidase enzyme is added to each well (avidin binds tightly to the biotin on the detection 
antibody). Any excess unbound enzyme is washed away. Finally, a chromogenic substrate for the enzyme is 
added and the intensity of the colour reaction that occurs is quantitated using an ELISA plate reader. The quantity 
of the gamma-interferon in the sample supernatants is determined by comparison with a standard curve using 
known quantities of human gamma-interferon. 
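
The comparison with a standard curve can be sketched as follows. This is a minimal illustration only: it assumes 
piecewise linear interpolation between standards (plate-reader software often fits a sigmoidal curve instead), and 
the function name and the example readings are hypothetical, not taken from the PharMingen protocol. 

```python
from bisect import bisect_left


def concentration_from_standard_curve(od, standards):
    """Interpolate a cytokine concentration from an optical density reading.

    standards: (concentration, optical_density) pairs measured from known
    quantities of cytokine. Linear interpolation is an assumption here.
    """
    points = sorted(standards)
    concs = [c for c, _ in points]
    ods = [o for _, o in points]
    # The curve must rise monotonically, otherwise a reading is ambiguous.
    if any(b <= a for a, b in zip(ods, ods[1:])):
        raise ValueError("standard curve must increase strictly with concentration")
    # Readings outside the standards cannot be interpolated.
    if od < ods[0] or od > ods[-1]:
        raise ValueError("reading outside the standard curve; dilute and re-assay")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return concs[i]
    fraction = (od - ods[i - 1]) / (ods[i] - ods[i - 1])
    return concs[i - 1] + fraction * (concs[i] - concs[i - 1])
```

A reading that falls above the highest standard raises an error rather than being extrapolated, which mirrors the 
usual practice of diluting the sample and repeating the measurement. 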

Other cytokines, such as interleukin-2 and interleukin-4, can be measured using the same 
protocol, with the appropriate substitution of reagents (monoclonal antibodies and standards). 

7. DNA SEQUENCING 

The sequencing of the alkaline phosphatase fusion clones was undertaken using the AmpliCycle thermal 
sequencing kit (Perkin Elmer, Applied Biosystems Division, 850 Lincoln Centre Drive, Foster City, CA 94404, 
U.S.A.), using a primer designed to read out of the alkaline phosphatase gene into the Mycobacterium tuberculosis 
DNA insert, or primers specific to the cloned sequences. 
C. RESULTS 

1. IMMUNOSTIMULATORY CAPACITY 
More than 300 fusion clones were tested for their ability to stimulate INF-γ production. Of these, 80 
clones were initially designated as having some ability to stimulate INF-γ production. Tables 1 and 2 show the data 
obtained for these 80 clones. Clones placed in Table 1 showed the greatest ability to stimulate INF-γ production 
(greater than 10 ng/ml of INF-γ), while clones placed in Table 2 stimulated the production of between 2 ng/ml and 
10 ng/ml of INF-γ. Background levels of INF-γ production (i.e., levels produced without any added M. 
tuberculosis antigen) were subtracted from the levels produced by the fusions to obtain the figures shown in these 
tables. 
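
The background subtraction and the assignment of a clone to Table 1 or Table 2 can be illustrated with a short 
sketch. The thresholds are those stated above, converted to pg/ml to match the units used in the tables; the 
function name is hypothetical, and clamping negative values to zero is an assumption rather than a stated rule. 

```python
TABLE_1_THRESHOLD_PG_ML = 10_000  # greater than 10 ng/ml
TABLE_2_THRESHOLD_PG_ML = 2_000   # between 2 ng/ml and 10 ng/ml


def classify_clone(raw_pg_ml, background_pg_ml):
    """Return (background-corrected INF-gamma in pg/ml, table name or None)."""
    # Assumption: a reading below background is reported as zero.
    net = max(raw_pg_ml - background_pg_ml, 0.0)
    if net > TABLE_1_THRESHOLD_PG_ML:
        return net, "Table 1"
    if net >= TABLE_2_THRESHOLD_PG_ML:
        return net, "Table 2"
    return net, None
```

For example, a hypothetical raw reading of 3,000 pg/ml against a background of 500 pg/ml gives a corrected value 
of 2,500 pg/ml, which would place the clone in Table 2. 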
TABLE 1 

Immunostimulatory AP-fusion clones 

No. | Name | INF | Fus-MW | TBport | coding | Similarity (score) 
1 | AciI#1-152 | >40,000 | ~65,000 | ~23,400 | ~633 | M. avium acetolactate synthase (98^) 
2 | AciI#1-247 | >40,000 | ~160,000 | ~118,400 | ~3,198 | peptide synthetase (153) 
3 | AciI#1-264 | >40,000 | ~72,500 | ~30,900 | ~833 | nothing evident 
4 | AciI#1-435 | >40,000 | ~80,000 | ~38,400 | ~1,038 | M. smegmatis ethambutol resistance gene EmbA (624) 




5. DETERMINATION OF IMMUNOSTIMULATORY CAPACITY IN MICE 
The purified alkaline phosphatase - Mycobacterium tuberculosis fusion peptides encoded by the 
recombinant clones were then tested for their ability to stimulate INF-γ production in mice. The test used to 
determine INF-γ stimulation is essentially that described by Orme et al. (11). 

Briefly, the assay method is as follows: The virulent strain M. tuberculosis Erdman is grown in 
Proskauer Beck medium to mid-log phase, then aliquoted and frozen at -70°C for use as an inoculant. Cultures of 
this bacterium are grown and harvested and mice are inoculated with 1 x 10^5 viable bacteria suspended in 200 µl 
sterile saline via a lateral tail vein on day one of the test. 

Bone marrow-derived macrophages are used in the test to present the bacterial alkaline phosphatase- 
Mycobacterium tuberculosis fusion protein antigens. These macrophages are obtained by harvesting cells from 
mouse femurs and culturing the cells in Dulbecco's modified Eagle medium as described by Orme et al. (11). 
Eight to ten days later, up to ten µg of the fusion peptide to be tested is added to the macrophages and the cells are 
incubated for 24 hours. 

The CD4 cells are obtained by harvesting spleen cells from the infected mice and then pooling and 
enriching for CD4 cells by removal of adherent cells by incubation on plastic Petri dishes, followed by incubation 
for 60 minutes at 37°C with a mixture of J11d.2, Lyt-2.43 and GL4 monoclonal antibody (mAb) in the presence 
of rabbit complement to deplete B cells and immature T cells, CD8 cells, and γδ cells, respectively. The 
macrophages are overlaid with 10^6 of these CD4 cells and the medium is supplemented with 5 U IL-2 to promote 
continued T cell proliferation and cytokine secretion. After 72 hours, cell supernatants are harvested from sets of 
triplicate wells and assayed for cytokine content. 

Cytokine levels in harvested supernatants are assayed by sandwich ELISA as described by Orme et al. (11). 

6. DETERMINATION OF IMMUNOSTIMULATORY CAPACITY IN HUMANS 
The purified alkaline phosphatase - Mycobacterium tuberculosis fusion peptides encoded by the 
recombinant clones or by synthetic peptides are tested for their ability to induce INF-γ production by human T 
cells in the following manner. 

Blood from tuberculin positive people (producing a tuberculin positive skin test) is collected in EDTA 
coated tubes, to prevent clotting. Mononuclear cells are isolated using a modified version of the separation 
procedure provided with the NycoPrep 1.077 solution (Nycomed Pharma AS, Oslo, Norway). Briefly, the blood 
is diluted in an equal volume of a physiologic solution, such as Hanks Balanced Salt solution (HBSS), and then 
gently layered over the top of the NycoPrep solution in a 2 to 1 ratio in 50 ml tubes. The tubes are centrifuged at 800 
x g for 20 minutes and the mononuclear cells are then removed from the interface between the NycoPrep solution 
and the sample layer. The plasma is removed from the top of the tube and filtered through a 0.2 micron filter and 
is then added to the tissue culture media. The mononuclear cells are washed twice: the cells are diluted in a 
physiologic solution, such as HBSS or RPMI 1640, and centrifuged at 400 x g for 10 minutes. The mononuclear 
cells are then resuspended to the desired concentration in tissue culture media (RPMI 1640 containing 10% 
autologous serum, Hepes, non-essential amino acids, antibiotics and polymyxin B). The mononuclear cells are then 
cultured in 96 well microtitre plates. 

Peptides or PhoA fusion proteins are then added to individual wells in the 96 well plate, and cells are then 
placed in an incubator (37°C, 5% CO2). Samples of the supernatants (tissue culture media from the wells 
containing the cells) are collected at various time points (from 3 to 8 days) after the addition of the peptides or 
PhoA fusion proteins. The immune responsiveness of T cells to the peptides and PhoA fusion proteins is assessed 
by measuring the production of cytokines (including gamma-interferon). 
TABLE 1 

Immunostimulatory AP-fusion clones 

No. | Name | INF | Fus-MW | TBport | coding | Similarity (score) 
21 | AciI#2-1084 | >20,000 | ~73,000 | | [illegible] | Sequences within M. tuberculosis clone X68281 (96+) and M. leprae clone B983 (122+) 
22 | AciI#3-47 | >20,000 | ~55,000 | ~13,400 | ~363 | nothing evident 
23 | AciI#3-133 | >20,000 | ~55,000 | ~13,400 | ~363 | nothing evident 
24 | AciI#3-166 | >20,000 | ~48,000 | ~6,400 | ~174 | nothing evident 
25 | AciI#3-167 | >20,000 | ~65,000 | ~23,400 | ~633 | M. leprae DNA sequence within region B983 (588*) 
26 | AciI#3-206 | >20,000 | ~65,000 | ~23,400 | ~633 | M. leprae DNA sequence within chromosome region MD0092 (91) 
27 | HinP#1-31 | 14,638 | ~46,000 | ~4,400 | ~120 | M. tuberculosis 19 kDa lipoprotein antigen precursor (218) 
28 | HinP#1-144 | 13,546 | ~70,000 | ~23,900 | ~645 | M. leprae DNA sequence within chromosome region B983 (78) 
29 | HinP#1-3 | 11,550 | ~49,000 | ~7,400 | ~200 | M. leprae DNA sequence within chromosome region B983 (100*) 
30 | AciI#1-486 | 11,416 | ~45,000 | ~3,400 | ~93 | nothing known 
31 | AciI#1-426 | 11,135 | ~47,500 | ~5,900 | ~160 | Dipeptide transport protein (65) 
32 | AciI#2-916 | 10,865 | ~75,000 | ~33,400 | ~903 | nothing evident 


Abbreviations: INF: pg/ml of INF-γ produced using the fusion to stimulate immune T-cells. Fus-MW: Relative 
molecular weight of the fusion protein in Da. TBport: Estimated amount of the fusion attributable to the M. 
tuberculosis protein (in Da). Coding: Amount of DNA needed to encode the TB portion of the fusion proteins (in base 
pairs). Similarity: Amino acid sequence similarity seen by analysis of DNA via the BLASTX or TBLASTX* 
programs. Scores for alignments are indicated in (). Due to the high G + C nature of M. tuberculosis DNA many false 
positives are evident. Only scores above 100 have good credibility. 
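
The relationship between the Fus-MW, TBport and coding columns can be sketched as below. Note that the two 
constants are not stated in the text: they are assumptions inferred from the tabulated rows, most of which are 
consistent with subtracting about 41,600 Da for the alkaline phosphatase portion and allowing about 111 Da per 
amino acid residue. A few rows do not follow this pattern exactly, so the sketch is illustrative only. 

```python
# Assumed constants, inferred from the table rows rather than given in the text.
PHOA_PORTION_DA = 41_600   # apparent mass of the alkaline phosphatase portion
AVG_RESIDUE_DA = 111       # apparent average mass per amino acid residue


def estimate_tb_portion(fusion_mw_da):
    """Return (TB portion in Da, coding DNA in base pairs) for a fusion protein."""
    tb_da = fusion_mw_da - PHOA_PORTION_DA
    residues = round(tb_da / AVG_RESIDUE_DA)
    # Three base pairs encode one amino acid residue.
    return tb_da, residues * 3
```

Under these assumptions a fusion of about 65,000 Da corresponds to a TB portion of about 23,400 Da encoded by 
about 633 bp, in agreement with the entries for such clones. 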
TABLE 1 

Immunostimulatory AP-fusion clones 

No. | Name | INF | Fus-MW | TBport | coding | Similarity (score) 
5 | HinP#1-27 | >20,000 | 59,000 | 17,400 | 471 | nothing evident 
6 | HinP#2-92 | >20,000 | 74,600 | 33,000 | 891 | 1. M. tuberculosis ORF MTCY190.11c (1794+); 2. Cytochrome c oxidase subunit II (141) 
7 | HinP#2-145 | >20,000 | 60,000 | 13,900 | 375 | nothing evident 
8 | HinP#2-150 | >20,000 | 55,000 | 13,400 | 362 | nothing evident 
9 | HinP#1-200 | >20,000 | [illegible] | 11,900 | 321 | nothing evident 
10 | HinP#[illegible] | >20,000 | [illegible] | 27,400 | 740 | M. leprae chromosome sequence in B983 region (281+) 
11 | AciI#2-2 | >20,000 | 70,000 | 28,400 | 768 | M. leprae chromosome sequence within region B1529 (139) 
12 | AciI#2-23 | >20,000 | 75,000 | 33,400 | 903 | Region within sequence MD0009 of the M. leprae chromosome 
13 | AciI#2-506 | >20,000 | 60,000 | 18,400 | 498 | nothing evident 
14 | AciI#2-511 | >20,000 | ~60,000 | ~18,400 | ~498 | nothing evident 
15 | AciI#2-639 | >20,000 | ~60,000 | ~18,400 | ~498 | nothing evident 
16 | AciI#2-822 | >20,000 | ~45,000 | ~3,400 | ~93 | M. tuberculosis sequence within region MD0074 (U27357) (55 P) 
17 | AciI#2-823 | >20,000 | ~46,500 | ~4,900 | ~132 | nothing evident 
18 | AciI#2-825 | >20,000 | ~150,000 | ~110,000 | ~2,970 | M. tuberculosis sequence MTCY31.03c (431) 
19 | AciI#2-827 | >20,000 | ~48,000 | ~6,400 | ~174 | cytochrome d oxidase 
20 | AciI#2-898 | >20,000 | ~49,000 | ~7,400 | ~201 | nothing evident 
TABLE 2 

Immunostimulatory AP-fusion clones (cont'd) 

No. | Clone Name | INF | Fus-MW | TBport | coding | Similarity (score) 
20 | AciI#2-1035 | 3,454 | ~46,000 | ~4,400 | ~120 | nothing evident 
21 | AciI#2-1089 | 8,974 | ~65,000 | ~23,400 | ~633 | Similar to M. tuberculosis sequence X75361 and sequence in M. bovis MD0057 and U34849 regions. Immunogenic proteins MPB64 and MPT64 are homologous. 
22 | AciI#2-1090 | 7,449 | ~65,000 | ~23,400 | ~633 | nothing evident 
23 | AciI#2-1104 | 5,148 | ~68,000 | ~26,400 | ~714 | Similar to M. tuberculosis sequence X80268 and to cds 1 (256) in M. leprae sequence region MD0045 (169*); secreted antigenic protein. 
24 | AciI#3-9 | 3,160 | ~67,000 | ~25,400 | ~687 | nothing evident 
25 | AciI#3-12 | 3,891 | ~75,000 | ~33,400 | ~903 | Penicillin binding protein; similar to M. leprae sequence within genomic clone B1529 
26 | AciI#3-15 | 4,019 | ~65,000 | ~23,400 | ~633 | nothing evident 
27 | AciI#3-21 | 2,301 | ~69,000 | ~27,400 | ~741 | nothing evident 
28 | AciI#3-78 | 2,905 | ~65,000 | ~23,400 | ~633 | Similar to sequence within M. leprae genomic clone B983 
29 | AciI#3-134 | 3,895 | ~45,000 | ~3,400 | ~93 | nothing evident 
30 | AciI#3-204 | 4,774 | ~60,000 | ~13,900 | ~375 | nothing evident 
31 | AciI#3-214 | 7,333 | ~50,000 | ~8,400 | ~228 | nothing evident 
32 | AciI#3-243 | 2,857 | ~65,000 | ~23,400 | ~633 | nothing evident 
33 | AciI#3-281 | 2,943 | ~65,000 | ~23,400 | ~633 | Similar to sequence within M. leprae genomic clone B983 
34 | BsaHI#1-21 | 8,122 | ~90,000 | ~48,400 | ~1,209 | nothing evident 
35 | HinP#1-12 | 2,905 | ~66,000 | ~24,400 | ~660 | possible tyrosine phosphatase 
TABLE 2 

Immunostimulatory AP-fusion clones (cont'd) 

No. | Clone Name | INF | Fus-MW | TBport | coding | Similarity (score) 
1 | AciI#1-62 | 3,126 | ~43,000 | ~1,400 | ~39 | M. tuberculosis MTCY190.11c cytochrome c oxidase subunit II (198); M. leprae sequence in B1551 region (1087*) 
2 | AciI#2-14 | [illegible] | [illegible] | [illegible] | [illegible] | nothing evident 
3 | [illegible] | | | | ~822 | nothing evident 
4 | AciI#2-35 | 3,907 | ~45,000 | ~3,400 | ~93 | Possibly similar to M. leprae sequence in the B983 region (116*) 
5 | AciI#2-147 | 5,464 | | | | nothing evident 
6 | AciI#2-508 | 7,052 | ~70,000 | ~28,400 | ~768 | Similar to sequence of the M. leprae ORF encoding gp U00018 (125) and similar to sequence in the B2168 c2-209 region of M. leprae genome ([illegible]*) 
7 | AciI#2-510 | 2,445 | ~69,000 | ~27,400 | ~741 | nothing evident 
8 | AciI#2-523 | 2,479 | ~50,000 | ~8,400 | ~228 | Similar to M. tuberculosis sequence Z70692 from clone Y427 (96) 
9 | AciI#2-676 | 3,651 | ~70,000 | ~28,400 | ~768 | Similar to AciI#2-639 
10 | AciI#2-834 | 5,942 | ~60,000 | ~13,900 | ~375 | nothing evident 
11 | AciI#2-854 | 5,560 | ~44,000 | ~2,400 | ~66 | nothing evident 
12 | AciI#2-872 | 2,361 | ~47,000 | ~5,400 | ~147 | nothing evident 
13 | AciI#2-874 | 2,171 | ~45,000 | ~3,400 | ~93 | nothing evident 
14 | AciI#2-8841 | 2,729 | ~85,000 | ~43,400 | ~1,173 | Isocitrate dehydrogenase (247) 
15 | AciI#2-894 | 3,396 | ~70,000 | ~28,400 | ~768 | nothing evident 
16 | AciI#2-1014 | 6,302 | ~45,000 | ~3,400 | ~93 | nothing evident 
17 | AciI#2-1018 | 4,642 | ~55,000 | ~13,400 | ~363 | nothing evident 
18 | AciI#2-1025 | 3,582 | ~45,000 | ~3,400 | ~93 | nothing evident 
19 | AciI#2-1034 | 2,736 | ~80,000 | ~38,400 | ~103 | nothing evident 
TABLE 2 

Immunostimulatory AP-fusion clones (cont'd) 

No. | Clone Name | INF | Fus-MW | TBport | coding | Similarity (score) 
46 | HpaII#1-8 | 2,048 | 110,000 | 68,400 | ~1,848 | nothing evident 
47 | HpaII#1-10 | 4,178 | 55,000 | 13,400 | ~633 | Similar to immunogenic proteins MPB64/MPT64 
48 | HpaII#1-13 | 3,714 | 43,000 | 1,400 | ~39 | nothing evident 


Abbreviations: INF: pg/ml of INF-γ produced using the fusion to stimulate immune T-cells. Fus-MW: Relative 
molecular weight of the fusion protein. TBport: Estimated amount of the fusion attributable to the M. 
tuberculosis protein. Coding: Amount of DNA needed to encode the TB portion of the fusion proteins. Similarity: 
Amino acid sequence similarity seen by analysis of DNA via the BLASTX or TBLASTX* programs. Scores 
for alignments are indicated in (). Due to the high G + C nature of M. tuberculosis DNA many false positives are 
evident. Only scores above 100 have good credibility. 






2. DNA SEQUENCING AND DETERMINATION OF OPEN READING FRAMES 

DNA sequence data for the Mycobacterium tuberculosis DNA present in the clones shown 
in Tables 1 and 2 are given in the accompanying Sequence Listing. The sequences are believed to represent the 
coding strand of the Mycobacterium DNA. In most instances, these sequences represent only partial sequences of 
the immunostimulatory peptides and, in turn, only partial sequences of Mycobacterium tuberculosis genes. 
However, each of the clones from which these sequences were derived encodes, by itself, at least one 
immunostimulatory T-cell epitope. As discussed in part V below, one of ordinary skill in the art will, given the 
information provided herein, readily be able to obtain the immunostimulatory peptides and corresponding full 
length M. tuberculosis genes using standard techniques. Accordingly, the nucleotide sequences of the present 
invention encompass not only those sequences presented in the sequence listings, but also the complete nucleotide 
sequence encoding the immunostimulatory peptides as well as the corresponding M. tuberculosis genes. The 
nucleotide abbreviations employed in the sequence listings are as follows in Table 3: 
TABLE 2 

Immunostimulatory AP-fusion clones (cont'd) 

No. | Clone Name | INF | Fus-MW | TBport | coding | Similarity (score) 
36 | HinP#2-23 | 2,339 | ~43,000 | ~1,400 | ~39 | Similar to sequence in M. leprae genomic clone MD0009-0-(B13) (354) 
37 | HinP#1-142 | 6,258 | ~69,000 | ~27,400 | ~741 | nothing evident 
38 | HinP#2-4 | 6,567 | ~66,000 | ~24,400 | ~660 | nothing evident 
39 | HinP#2-143 | 3,689 | ~65,000 | ~23,400 | ~633 | Similar to sequence in M. leprae genomic clone B1529 
40 | HinP#2-145A | 2,314 | ~64,000 | ~22,400 | ~606 | nothing evident 
41 | HinP#2-147 | 7,021 | [illegible] | [illegible] | | nothing evident 
42 | HinP#3-28 | 2,980 | 70,000 | 28,400 | ~768 | Similar to M. leprae sequence in genomic clones MD0085 and sequence for M. leprae gp U00013 cds 27 of B1496 region 
43 | HinP#3-34 | 2,564 | 71,000 | 29,400 | ~795 | Similar to sequence in M. leprae genomic clone B2168 (U00018 cds 9) 
44 | HinP#3-41 | 3,296 | 48,000 | 6,400 | ~1,728 | Similar to antigen 85 complex protein subunit 
45 | HpaII#1-3 | 2,360 | 65,000 | 23,400 | ~633 | Cytochrome c oxidase subunit II (156). Similar to M. tuberculosis sequence on clone MTCY190.11c 
Peptides designed from sequences described in this application include: 

HinP#1-200 (6 peptides) 

Peptide Sequence       Peptide Name 
VKLATGMAETVASFSPS      HPI1-200/2 
REWHLATGMAETVASF       HPI1-200/3 
RDSREWHLATGMAETV       HPI1-200/4 
DFNRDSREWHLATGMA       HPI1-200/5 
ISAAWTGYLRWTTPDR       HPI1-200/6 
AWFLCAAAISAAWTG        HPI1-200/7 

AciI#2-827 (14 peptides) 

Peptide Sequence       Peptide Name 
VTDNPAWYRLTKFFGKL      CD-2/1/96/1 
AWYRLTKFFGKLFLINF      CD-2/1/96/2 
KFFGKLFLINFAIGVAT      CD-2/1/96/3 
FLINFAIGVATGIVQEF      CD-2/1/96/4 
AIGVATGIVQEFQFGMN      CD-2/1/96/5 
TGIVQEFEFGMNWSEYS      CD-2/1/96/6 
EFQFGMNWSEYSRFVGD      CD-2/1/96/7 
MNWSEYSRFVGDVFGAP      CD-2/1/96/8 
WSEYSRFVGDVFGAPLA      CD-2/1/96/9 
EYSRFVGDVFGAPLAME      CD-2/1/96/10 
SRFVGDVFGAPLAMESL      CD-2/1/96/11 
WIFGWNRLPRLVHLACI      CD-2/1/96/12 
WNRLPRLVHLACIWIVA      CD-2/1/96/13 
GRAELSSIWLLTNNTA       CD-2/1/96/14 

HinP#1-3 (2 peptides) 

Peptide Sequence       Peptide Name 
GKTYDAYFTDAGGITPG      HPI1-3/2 
YDAYFTDAGGITPGNSV      HPI1-3/3 

HinP#1-3 / HinP#1-200 combined peptides 

Peptide Sequences                     Peptide Name 
WPQGKTYDAYFTDAGGI (HinP#1-3)          HPI1-3/1 (combined) 
ATGMAETVASFSPSEGS (HinP#1-200) 

AciI#2-823 (1 peptide) 

Peptide Sequence       Peptide Name 
GWERRLRHAVSPKDPAQ      AI2-823/1 

HinP#1-31 (4 peptides) 

Peptide Sequence       Peptide Name 
TGSGETTTAAGTTASPG      HPI1-31/1 
GAAILVAGLSGCSSNKS      HPI1-31/2 
AVAGAAILVAGLSGCSS      HPI1-31/3 
LTVAVAGAAILVAGLSG      HPI1-31/4 

These synthetic peptides were resuspended in phosphate buffered saline to be tested to confirm their ability 
to function as T cell epitopes using the procedure described in part IV(B)(6) above. 

5. CONFIRMATION OF IMMUNOSTIMULATORY CAPACITY USING T CELLS 
FROM TUBERCULOSIS PATIENTS 

The synthetic peptides described above, along with a number of the PhoA fusion proteins shown to be 
immunostimulatory in mice, were tested for their ability to stimulate gamma interferon production in T-cells from 
tuberculin positive people using the methods described in part IV(B)(6) above. For each assay, 5 x 10^5 
mononuclear cells were stimulated with up to 1 µg/ml M. tuberculosis peptide or up to 50 ng/ml PhoA fusion 
protein. M. tuberculosis filtrate proteins, Con A and PHA were employed as positive controls. An assay was run 
with media alone to determine background levels, and PhoA protein was employed as a negative control. 
Symbol | Meaning 
A | A; adenine 
C | C; cytosine 
G | G; guanine 
T | T; thymine 
U | U; uracil 
M | A or C 
R | A or G 
W | A or T/U 
S | C or G 
Y | C or T/U 
K | G or T/U 
V | A or C or G; not T/U 
H | A or C or T/U; not G 
D | A or G or T/U; not C 
B | C or G or T/U; not A 
N | (A or C or G or T/U) or (unknown or other or no base) 

indeterminate* 
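
For readers working with the Sequence Listing computationally, the symbols above can be expressed as a lookup 
table. The sketch below is illustrative only; the names are hypothetical, U is treated as T because the listing 
contains DNA, and N is expanded to the four bases even though it may also denote an unknown or missing base. 

```python
from itertools import product

# Ambiguity symbols from the table above, written for DNA (U treated as T).
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def is_ambiguous(sequence):
    """True if any symbol stands for more than one base."""
    return any(len(IUPAC_BASES[base]) > 1 for base in sequence.upper())


def expand(sequence):
    """List every unambiguous sequence consistent with the given symbols."""
    choices = (IUPAC_BASES[base] for base in sequence.upper())
    return ["".join(bases) for bases in product(*choices)]
```

The number of expansions grows quickly with the number of ambiguous positions, so `expand` is practical only for 
short stretches such as candidate primer sites. 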



The DNA sequences obtained were then analyzed with respect to the G+C content as a function of codon 
position over a window of 120 codons using the 'FRAME' computer program (Bibb, M.J.; Findlay, P.R.; and 
Johnson, M.W.; Gene 30: 157-166 (1984)). This program uses the bias of these nucleotides for each of the codon 
positions to enable the correct reading frame to be identified. 
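
The kind of calculation involved can be sketched as follows. This is not the FRAME program itself, only a minimal 
illustration of G+C content per codon position over a sliding window of codons; the function name is hypothetical 
and the input is read in a single frame starting at its first base. 

```python
def gc_by_codon_position(sequence, window_codons=120):
    """Return, for each window, the G+C fraction at codon positions 1, 2 and 3.

    The sequence is read in the frame that starts at its first base; to test
    the other two frames, drop one or two leading bases and call again.
    """
    sequence = sequence.upper()
    n_codons = len(sequence) // 3
    is_gc = [base in "GC" for base in sequence[: n_codons * 3]]
    profiles = []
    for start in range(n_codons - window_codons + 1):
        window = is_gc[start * 3 : (start + window_codons) * 3]
        # Every third base, offset by 0, 1 or 2, gives one codon position.
        profiles.append(
            tuple(sum(window[pos::3]) / window_codons for pos in range(3))
        )
    return profiles
```

In G+C rich genomes the third codon position of a genuine coding frame typically shows the highest G+C fraction, so 
comparing the three profiles across the three possible frames suggests which frame is translated. 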

3. IDENTIFICATION OF T CELL EPITOPES IN THE IMMUNOSTIMULATORY 
PEPTIDES 

The T-Site program, by Feller, D.C. and de la Cruz, V.F., MedImmune Inc., 19 Firstfield Rd., 
Gaithersburg, MD 20878, U.S.A., was used to predict T-cell epitopes from the determined coding sequences. It 
uses a series of four predictive algorithms. In particular, peptides were designed against regions indicated by the 
algorithm "A" motif, which predicts alpha-helical periodicity (Margalit, H.; Spouge, J.L.; Cornette, J.L.; Cease, 
K.B.; DeLisi, C.; and Berzofsky, J.A., J. Immunol., 138:2213 (1987)) and amphipathicity, and those indicated by 
the algorithm "R" motif, which identifies segments that display similarity to motifs known to be recognized by 
MHC class I and class II molecules (Rothbard, J.B. and Taylor, W.R., EMBO J. 7:93 (1988)). The other two 
algorithms identify classes of T-cell epitopes recognized in mice. 

4. SYNTHESIS OF SYNTHETIC PEPTIDES CONTAINING T CELL EPITOPES IN 
IDENTIFIED IMMUNOSTIMULATORY PEPTIDES 

A series of staggered peptides were designed to overlap regions indicated by the T-site analysis. These 
were synthesized by Chiron Mimotopes Pty. Ltd. (11055 Roselle St., San Diego, CA 92121, U.S.A.). 
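
The design of staggered, overlapping peptides can be illustrated with a short sketch. The peptide length and 
offset used here are assumptions chosen for illustration (the peptides actually synthesized are 15 to 17 residues 
long with varying offsets), and the function name is hypothetical. 

```python
def staggered_peptides(region, length=17, step=3):
    """Split a predicted epitope region into overlapping peptides.

    Each peptide is `length` residues long and starts `step` residues after
    the previous one, so neighbours overlap by `length - step` residues.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(region) - length
    return [region[i : i + length] for i in range(0, last_start + 1, step)]
```

A smaller step gives more peptides and a finer mapping of the epitope, at the cost of more syntheses. 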
An alternative approach to cloning the full length ORFs corresponding to the DNA sequences provided 
herein is the use of the polymerase chain reaction (PCR). In particular, the inverse polymerase chain reaction 
(IPCR) is useful to isolate DNA sequences flanking a known sequence. Methods for amplification of flanking 
sequences by IPCR are described in Chapter 27 of reference no. 17 and in reference no. 23. 

Accordingly, one aspect of the present invention is small oligonucleotides encompassed by the DNA 
sequences presented in the Sequence Listing. These small oligonucleotides are useful as hybridization probes and 
PCR primers that can be employed to clone the corresponding full length Mycobacterium tuberculosis ORFs. In 
preferred embodiments, these oligonucleotides will comprise at least 15 contiguous nucleotides of a DNA sequence 
set forth in the Sequence Listing, and in more preferred embodiments, such oligonucleotides will comprise at least 
20 contiguous nucleotides of a DNA sequence set forth in the Sequence Listing. 

One skilled in the art will appreciate that hybridization probes and PCR primers are not required to exactly 
match the target gene sequence to which they anneal. Therefore, in another embodiment, the oligonucleotides will 
comprise a sequence of at least 15 nucleotides and preferably at least 20 nucleotides, the oligonucleotide sequence 
being substantially similar to a DNA sequence set forth in the Sequence Listing. Preferably, such oligonucleotides 
will share at least about 75%-90% sequence identity with a DNA sequence set forth in the Sequence Listing and 
more preferably the shared sequence identity will be greater than 90%. 
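
The length and identity criteria above can be checked with a simple calculation. The sketch below assumes an 
ungapped comparison of the oligonucleotide against every window of the same length in the listed sequence, which 
is a simplification of how sequence identity is usually computed; the function names and default thresholds are 
illustrative. 

```python
def percent_identity(oligo, target):
    """Best ungapped percent identity of oligo against any window of target."""
    oligo, target = oligo.upper(), target.upper()
    if not oligo or len(oligo) > len(target):
        raise ValueError("oligo must be non-empty and no longer than target")
    best = 0
    for start in range(len(target) - len(oligo) + 1):
        window = target[start : start + len(oligo)]
        matches = sum(a == b for a, b in zip(oligo, window))
        best = max(best, matches)
    return 100.0 * best / len(oligo)


def meets_criteria(oligo, target, min_length=15, min_identity=75.0):
    """True if the oligo is long enough and similar enough to the target."""
    return len(oligo) >= min_length and percent_identity(oligo, target) >= min_identity
```

For example, a 16-nucleotide oligonucleotide with one mismatch to the listed sequence has 93.75% identity and so 
falls within the more preferred range of greater than 90%. 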

B. EXAMPLE - CLONING OF THE FULL LENGTH ORF CORRESPONDING TO CLONE HinP 
#2-92 

Using the techniques described below, the full length gene corresponding to the clone HinP #2-92 was 
obtained. This gene, herein termed mtb2-92, includes an open reading frame of 1089 bp (identified based on the 
G + C content relating to codon position). The alternative 'GTG' start codon was used, and this was preceded 
(8 bp upstream) by a Shine-Dalgarno motif. The gene mtb2-92 encoded a protein (termed MTB2-92) containing 
363 amino acid residues with a predicted molecular weight of 40,436.4 Da. 

Sequence homology comparisons of the predicted amino acid sequence of MTB2-92 with known proteins in 
the database indicated similarity to the cytochrome c oxidase subunit II of many different organisms. This integral 
membrane protein is part of the electron transport chain, subunits I and II forming the functional core of the 
enzyme complex. 

1. CLONING THE FULL LENGTH GENE CORRESPONDING TO HinP #2-92 
The plasmid pHin2-92 was restricted with either BamHI or EcoRI and then subcloned into the vector M13. 
The insert DNA fragments were sequenced under the direction of M13 universal sequencing primers (Yanisch- 
Perron, C. et al., 1985) using the AmpliCycle thermal sequencing kit (Perkin Elmer, Applied Biosystems Division, 
850 Lincoln Centre Drive, Foster City, CA 94404, U.S.A.). The 5'-partial MTB2-92 DNA sequence was aligned 
using a GeneWorks (IntelliGenetics, Mountain View, CA, U.S.A.) program. Based on the sequence data obtained, 
two oligomers were synthesized. These oligonucleotides (5' CCCAGCTTGTGATACAGGAGG 3' and 
5' GGCCTCAGCGCGGCTCCGGAGG 3') represented sequences upstream and downstream, over a 0.8 kb distance, 
of the sequence encoding the partial MTB2-92 protein in the alkaline phosphatase fusion. 
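
Basic properties of the two oligonucleotides can be tabulated with a short sketch. The melting temperature uses 
the Wallace rule, 2(A+T) + 4(G+C), which is only a rough guide for oligonucleotides of this length and is an 
assumption of this illustration rather than a method stated in the text; the function names are hypothetical. 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def primer_summary(sequence):
    """Length, G+C percentage and a rough Wallace-rule melting temperature."""
    sequence = sequence.upper()
    gc = sum(base in "GC" for base in sequence)
    at = len(sequence) - gc
    return {
        "length": len(sequence),
        "gc_percent": round(100.0 * gc / len(sequence), 1),
        "wallace_tm_c": 2 * at + 4 * gc,  # rough estimate only
    }


PRIMERS = ("CCCAGCTTGTGATACAGGAGG", "GGCCTCAGCGCGGCTCCGGAGG")
SUMMARIES = [primer_summary(p) for p in PRIMERS]
```

The high G+C percentage of both oligonucleotides is in keeping with the G+C rich Mycobacterium tuberculosis 
genome noted elsewhere in this description. 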

A Mycobacterium tuberculosis genomic cosmid DNA library was screened using PCR (Sambrook, J. et al., 
1989) in order to obtain the full-length gene encoding the MTB2-92 protein. Two hundred and ninety-four 
bacterial colonies containing the cosmid library were pooled into 10 groups in 100 µl distilled water aliquots and 
boiled for 5 min. The samples were spun in a microfuge at maximal speed for 5 min. The supernatants were 
decanted and stored on ice prior to PCR analysis. 

The 100 µl PCR reaction contained: 10 µl supernatant containing cosmid DNA, 10 µl of 10X PCR buffer, 
250 µM dNTPs, 300 nM downstream and upstream primers, 1 unit Taq DNA polymerase. 
The results, shown in Table 4 below, indicate that all of the peptides tested stimulated gamma interferon 
production from T-cells of a particular subject. 

TABLE 4 

Peptide or PhoA Fusion Protein Name | Concentration of Interferon-gamma (pg/ml) | Concentration of Interferon-gamma minus background (pg/ml) 
CD-2/1/96/1 | 256.6 | 153.3 
CD-2/1/96/9 | 187.6 | 84.3 
CD-2/1/96/10 | 134.0 | 30.7 
CD-2/1/96/11 | 141.6 | 38.3 
CD-2/1/96/14 | 310.2 | 206.9 
HPI1-3/2 | 136.3 | 23.0 
HPI1-3/3 | 264.2 | 160.9 
AciI#2-898 | 134.0 | 30.7 
AciI#3-47 | 386.8 | 283.5 
M. tuberculosis filtrate proteins (10 µg/ml) | 256.6 | 153.3 
M. tuberculosis filtrate proteins (5 µg/ml) | 134.0 | 30.7 
Con A (10 µg/ml) | 2,839 | 2,735.7 
PHA (1%) | 10,378 | 10,274.7 
PhoA control (10 µg/ml) | 26.7 | 0 
Background | 103.3 | 0 



V. CLONING OF FULL LENGTH MYCOBACTERIUM TUBERCULOSIS T-CELL EPITOPE ORFS 

Most of the sequences presented represent only part of a larger M. tuberculosis ORF. If desired, the full 
length M. tuberculosis ORFs that include these provided nucleotide sequences can be readily obtained by one of 
ordinary skill in the art, based on the sequence data provided herein. 
A. GENERAL METHODOLOGIES 

Methods for obtaining full length genes based on partial sequence information are standard in the art and 
are particularly simple for prokaryotic genomes. By way of example, the full length ORFs corresponding to the 
DNA sequences presented herein may be obtained by creating a library of Mycobacterium tuberculosis DNA in a 
plasmid, bacteriophage or phagemid vector and screening this library with a hybridization probe using standard 
colony hybridization techniques. The hybridization probe consists of an oligonucleotide derived from a DNA 
sequence according to the present invention, labelled with a suitable marker to enable detection of hybridizing 
clones. Suitable markers include radionuclides, such as P-32, and non-radioactive markers, such as biotin. 
Methods for constructing suitable libraries, production and labelling of oligonucleotide probes and colony 
hybridization are standard laboratory procedures and are described in standard laboratory manuals such as 
reference nos. 15 and 16. 

Once a clone that hybridizes with the oligonucleotide has been identified, it is sequenced 
using standard methods such as described in Chapter 13 of reference no. 15. Determination of the translation 
initiation point of the DNA sequence enables the ORF to be located. 
At each stage of the protein purification, a sample was analysed by SDS polyacrylamide gel electrophoresis 
(Laemmli, U.K. (1970) Nature (London), 227:680-685) (see Fig. 2). 
C. CORRECTION OF SEQUENCE ERRORS 

It is noted that some of the sequences presented in the Sequence Listing contain sequence ambiguities. 
Naturally, in order to ensure that the immunostimulatory function is maintained, one would utilize a sequence 
without such ambiguities. For those sequences containing ambiguities, one would therefore utilize the sequence 
data provided in the Sequence Listing to design primers corresponding to each terminus of the provided sequence 
and, using these primers in conjunction with the polymerase chain reaction, synthesize the desired DNA molecule 
using M. tuberculosis genomic DNA as a template. Standard PCR methodologies, such as those described above, 
may be used to accomplish this. 
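
Selecting primers at each terminus of a listed sequence can be sketched as follows. The primer length of 20 
nucleotides and the rule of rejecting any primer that itself contains an ambiguity symbol are assumptions made for 
this illustration; the function name is hypothetical. 

```python
UNAMBIGUOUS = set("ACGT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def terminal_primers(sequence, length=20):
    """Return (forward, reverse) primers taken from the two ends of a sequence.

    The forward primer is the first `length` bases; the reverse primer is the
    reverse complement of the last `length` bases. Internal ambiguities are
    allowed, since the PCR product is copied from genomic DNA.
    """
    sequence = sequence.upper()
    if len(sequence) < 2 * length:
        raise ValueError("sequence too short for two non-overlapping primers")
    forward_site, reverse_site = sequence[:length], sequence[-length:]
    for site in (forward_site, reverse_site):
        if set(site) - UNAMBIGUOUS:
            raise ValueError("terminal region contains an ambiguous base")
    reverse = reverse_site.translate(COMPLEMENT)[::-1]
    return forward_site, reverse
```

If a terminal region itself contains an ambiguity, the primer site would have to be moved inward to the nearest 
fully determined stretch. 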

VI. EXPRESSION AND PURIFICATION OF THE CLONED PEPTIDES 

Having provided herein DNA sequences encoding Mycobacterium tuberculosis peptides having an 
immunostimulatory activity, as well as the corresponding full length Mycobacterium tuberculosis genes, one of skill 
in the art will be able to express and purify the peptides encoded by these sequences. Methods for expressing 
proteins by recombinant means in compatible prokaryotic or eukaryoiic host cells are well known in the art and are 
discussed, for example, in reference nos. 15 and 16. Peptides expressed by the nucleotide sequences disclosed 
herein are useful for preparing vaccines effective against M. tuberculosis infection, for use in diagnostic assays and 
for raising antibodies that specifically recognize M. tuberculosis proteins. One method of purifying the peptides is 
that presented in pan V(B) above. 

The most commonly used prokaryotic host cells for expressing prokaryotic peptides are strains of
Escherichia coli, although other prokaryotes, such as Bacillus subtilis, Streptomyces or Pseudomonas, may also be
used, as is well known in the art. Partial or full-length DNA sequences, encoding an immunostimulatory peptide
according to the present invention, may be ligated into bacterial expression vectors. One aspect of the present
invention is thus a recombinant DNA vector including a nucleic acid molecule provided by the present invention.
Another aspect is a transformed cell containing such a vector.

Methods for expressing large amounts of protein from a cloned gene introduced into Escherichia coli 
(E. coli) may be utilized for the purification of the Mycobacterium tuberculosis peptides. Methods and plasmid 
vectors for producing fusion proteins and intact native proteins in bacteria are described in reference no. 15 
(ch. 17). Such fusion proteins may be made in large amounts, are relatively simple to purify, and can be used to 
produce antibodies. Native proteins can be produced in bacteria by placing a strong, regulated promoter and an 
efficient ribosome binding site upstream of the cloned gene. If low levels of protein are produced, additional steps 
may be taken to increase protein production; if high levels of protein are produced, purification is relatively easy. 

Often, proteins expressed at high levels are found in insoluble inclusion bodies. Methods for extracting
proteins from these aggregates are described in ch. 17 of reference no. 15. Vector systems suitable for the
expression of lacZ fusion genes include the pUR series of vectors (24), pEX1-3 (25) and pMR100 (26). Vectors
suitable for the production of intact native proteins include pKC30 (27), pKK177-3 (28) and pET-3 (29). Fusion
proteins may be isolated from protein gels, lyophilized, ground into a powder and used as antigen preparations.

Mammalian or other eukaryotic host cells, such as those of yeast, filamentous fungi, plant, insect,
amphibian or avian species, may also be used for protein expression, as is well known in the art. Examples of
commonly used mammalian host cell lines are VERO and HeLa cells, Chinese hamster ovary (CHO) cells, and
WI38, BHK, and COS cell lines, although it will be appreciated by the skilled practitioner that other prokaryotic
and eukaryotic cells and cell lines may be appropriate for a variety of purposes, e.g., to provide higher expression,
desirable glycosylation patterns, or other features.

The reactions were heated at 95°C for 2 min and then 40 cycles of DNA synthesis were performed (95°C
for 30 s, 65°C for 1 min, 72°C for 2 min). The PCR products were loaded into a 1% agarose gel in TAE buffer
(Sambrook, J. et al., 1989) for analysis.
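As a rough planning aid, the cycling program stated above can be written down as data and its total hold time computed. This is a minimal sketch: the variable and function names are illustrative assumptions, and the figure excludes the ramp time between temperatures, which depends on the instrument.

```python
# (step label, temperature in degrees C, duration in seconds), as given in the text
INITIAL_DENATURATION = ("denature", 95, 120)
CYCLE_STEPS = [
    ("denature", 95, 30),
    ("anneal", 65, 60),
    ("extend", 72, 120),
]
CYCLES = 40


def hold_time_seconds(initial, steps, cycles):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return initial[2] + cycles * sum(duration for _, _, duration in steps)


if __name__ == "__main__":
    total = hold_time_seconds(INITIAL_DENATURATION, CYCLE_STEPS, CYCLES)
    print(f"{total} s ({total / 60:.0f} min)")
```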

The supernatant, which produced 800 bp PCR products, was then further divided into 10 samples and the
PCR reactions were performed again. The colony which had resulted in the correctly sized PCR product was then
picked. The cosmid DNA from the positive clone (pG3) was prepared using the Wizard Mini-Prep Kit (Promega
Corp, Madison, WI, U.S.A.). The cosmid DNA was further sequenced using specific oligonucleotide primers.
The deduced amino acid sequence of the MTB2-92 protein is shown in Fig. 1.

2. EXPRESSION OF THE FULL LENGTH GENE 

To conveniently purify the recombinant protein, a histidine tag coding sequence was engineered
immediately upstream of the start codon of mtb2-92 using PCR. Two unique restriction enzyme sites, for XbaI and
HindIII, were added to the ends of the PCR product for convenient subcloning. Two oligomers were used to
direct the PCR reaction: (5' TCTAGACACCACCACCACCACCACGTGACACCTCGCGGGCCAGGTC 3' and
5' AAGCTTCGCCATGCCGCCGGTAAGCGCC 3').
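The structure of the two oligomers can be checked with a short Python sketch. The sketch assumes that the forward primer reads, 5' to 3', as an XbaI site, a run of histidine codons, and then the beginning of the gene. The helper names are hypothetical, and the standard genetic code (Table 5) is used for translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

XBAI_SITE = "TCTAGA"
HINDIII_SITE = "AAGCTT"

# The concatenation below equals the forward oligomer as printed in the text.
FORWARD_PRIMER = XBAI_SITE + "CAC" * 6 + "GTGACACCTCGCGGGCCAGGTC"
REVERSE_PRIMER = "AAGCTTCGCCATGCCGCCGGTAAGCGCC"


def translate(dna):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def dissect_forward_primer(primer):
    """Return (number of His codons, gene-specific part) of the forward primer."""
    if not primer.startswith(XBAI_SITE):
        raise ValueError("forward primer does not begin with an XbaI site")
    body = primer[len(XBAI_SITE):]
    n_his = 0
    while body[3 * n_his:3 * n_his + 3] in ("CAC", "CAT"):
        n_his += 1
    return n_his, body[3 * n_his:]


if __name__ == "__main__":
    n_his, gene_part = dissect_forward_primer(FORWARD_PRIMER)
    print(n_his, gene_part, translate(FORWARD_PRIMER[len(XBAI_SITE):]))
    print(REVERSE_PRIMER.startswith(HINDIII_SITE))
```

Under these assumptions the gene-specific part begins with GTG, which would be consistent with GTG serving as the initiator codon (see the note to Table 5), although the text does not state this explicitly.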

A 100 µl PCR reaction contained: 1 µg pG3 template DNA, 250 µM dNTPs, 300 nM of each primer,
10 µl of 10X PCR buffer and 1 unit Taq DNA polymerase. The PCR DNA synthesis cycle was performed as above.

The 1.4 kb PCR products were purified and ligated into the cloning vector pGEM-T (Promega). Inserts
were removed by digestion using both XbaI and HindIII and the 1.4 kb fragment was directionally subcloned
into the XbaI and HindIII sites of the pMAL-c2 vector (New England Bio-Labs Ltd., 3397 American Drive, Unit 12,
Mississauga, Ontario, L4V 1T8, Canada). The gene encoding MTB2-92 was fused, in frame, downstream of the
maltose binding protein (MBP). This expression vector was named pMAL-MTB2-92.

3. PURIFICATION OF THE ENCODED PROTEIN 

The plasmid pMAL-MTB2-92 was transformed into competent E. coli JM109 cells and a 1 litre culture
was grown up in LB broth at 37°C to an OD S30 of 0.5 to 0.6. The expression of the gene was induced by the
addition of IPTG (0.5 mM) to the culture medium, after which the culture was grown for another 3 hours at 37°C
with vigorous shaking. Cultures were centrifuged at 10,000 g for 30 min and the cell pellet was
harvested. This was re-suspended in 50 ml of 20 mM Tris-HCl, pH 7.2, 200 mM NaCl, 1 mM EDTA
supplemented with 10 mM β-mercaptoethanol and stored at -20°C.

The frozen bacterial suspension was thawed in cold water (0°C), placed in an ice bath, and sonicated. The
resulting cell lysate was then centrifuged at 10,000 g and 4°C for 30 min, the supernatant retained, diluted with
5 volumes of buffer A and applied to an amylose-resin column (New England Bio-Labs Ltd., 3397 American
Drive, Unit 12, Mississauga, Ontario, L4V 1T8, Canada) which had been pre-equilibrated with buffer A. The
column was then washed with buffer A until the eluate reached an A280 of 0.001, at which point the bound MBP-MTB2-92
fusion protein was eluted with buffer A containing 10 mM maltose. The protein purified by the
amylose-resin affinity column was about 84 kDa, which corresponded to the expected size of the fusion protein
(MBP: 42 kDa; MTB2-92 plus the histidine tag: 42 kDa).

The eluted MBP-MTB2-92 fusion protein was then cleaved with factor Xa to remove the MBP from the
MTB2-92 protein. One ml of fusion protein (1 mg/ml) was mixed with 100 µl of factor Xa (200 µg/ml) and kept
at room temperature overnight. The mixture was diluted with 10 ml of buffer B (5 mM imidazole, 0.5 M NaCl,
20 mM Tris-HCl, pH 7.9, 6 M urea) and urea was added to the sample to a final concentration of 6 M. The
sample was loaded onto the Ni-NTA column (QIAGEN, 9600 De Soto Ave., Chatsworth, CA 91311, U.S.A.)
pre-equilibrated with buffer B. The column was washed with 10 volumes of buffer B and 6 volumes of buffer C
(60 mM imidazole, 0.5 M NaCl, 20 mM Tris-HCl, pH 7.9, 6 M urea). The bound protein was eluted with
6 volumes of buffer D (1 M imidazole, 0.5 M NaCl, 20 mM Tris-HCl, pH 7.9, 6 M urea).
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TABLE 5 
The Genetic Code

First                                                          Third
Position                                                       Position
(5' end)                  Second Position                      (3' end)

              T            C            A             G

   T          Phe          Ser          Tyr           Cys          T
              Phe          Ser          Tyr           Cys          C
              Leu          Ser          Stop (och)    Stop         A
              Leu          Ser          Stop (amb)    Trp          G

   C          Leu          Pro          His           Arg          T
              Leu          Pro          His           Arg          C
              Leu          Pro          Gln           Arg          A
              Leu          Pro          Gln           Arg          G

   A          Ile          Thr          Asn           Ser          T
              Ile          Thr          Asn           Ser          C
              Ile          Thr          Lys           Arg          A
              Met          Thr          Lys           Arg          G

   G          Val          Ala          Asp           Gly          T
              Val          Ala          Asp           Gly          C
              Val          Ala          Glu           Gly          A
              Val (Met)    Ala          Glu           Gly          G

"Stop (och)" stands for the ochre termination triplet, and
"Stop (amb)" for the amber. ATG is the most common
initiator codon; GTG usually codes for valine, but it can
also code for methionine to initiate an mRNA chain.
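For readers who prefer an executable form, the genetic code of Table 5 can be expressed as a lookup table. The following Python sketch is illustrative only: the function name and the gtg_start flag are assumptions introduced here to mirror the note that GTG can act as an initiator codon.

```python
from itertools import product

# Codons enumerated with T, C, A, G at each position, matching the layout of Table 5.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, gtg_start=False):
    """Translate a DNA coding strand into one-letter amino acids ("*" is a stop).

    If gtg_start is true, a leading GTG is read as the initiator methionine.
    A trailing partial codon is ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    protein = [CODON_TABLE[codon] for codon in codons]
    if gtg_start and codons and codons[0] == "GTG":
        protein[0] = "M"
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(translate("GTGACACCT", gtg_start=True))
```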
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VII. SEQUENCE VARIANTS 

It will be apparent to one skilled in the art that the immunostimulatory activity of the peptides encoded by
the DNA sequences disclosed herein lies not in the precise nucleotide sequence of the DNA sequences, but rather
in the epitopes inherent in the amino acid sequences encoded by the DNA sequences. It will therefore also be
apparent that it is possible to recreate the immunostimulatory activity of one of these peptides by recreating the
epitope, without necessarily recreating the exact DNA sequence. This could be achieved either by directly
synthesizing the peptide (thereby circumventing the need to use the DNA sequences) or, alternatively, by designing
a nucleic acid sequence that encodes the epitope, but which differs, by reason of the redundancy of the genetic
code, from the sequences disclosed herein.

Accordingly, the degeneracy of the genetic code further widens the scope of the present invention as it
enables major variations in the nucleotide sequence of a DNA molecule while maintaining the amino acid sequence
of the encoded protein. The genetic code and variations in nucleotide codons for particular amino acids are
presented in Tables 5 and 6. Based upon the degeneracy of the genetic code, variant DNA molecules may be
derived from the DNA sequences disclosed herein using standard DNA mutagenesis techniques, or by synthesis of
DNA sequences.
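The extent of this degeneracy can be made concrete with a short calculation. The Python sketch below (names are illustrative assumptions) counts how many distinct DNA sequences encode a given peptide and groups the amino acids by their number of synonymous codons, which reproduces the totals of Table 6.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid; "*" collects the three termination codons.
DEGENERACY = Counter(CODON_TABLE.values())


def synonymous_sequence_count(peptide):
    """Number of distinct coding sequences for a peptide in one-letter code."""
    return prod(DEGENERACY[aa] for aa in peptide)


def degeneracy_classes():
    """Map each codon count to the amino acids sharing it (stops excluded)."""
    classes = {}
    for aa, n_codons in DEGENERACY.items():
        if aa != "*":
            classes.setdefault(n_codons, []).append(aa)
    return classes


if __name__ == "__main__":
    for peptide in ("MW", "LSR", "HHHHHH"):
        print(peptide, synonymous_sequence_count(peptide))
    for n_codons, members in sorted(degeneracy_classes().items(), reverse=True):
        print(n_codons, "".join(sorted(members)), n_codons * len(members))
```

Even a three-residue peptide such as Leu-Ser-Arg has 216 possible coding sequences, which indicates how large the family of equivalent DNA molecules becomes for a full-length peptide.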
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TABLE 7 



Original Residue          Conservative Substitutions

Ala                       ser
Arg                       lys
Asn                       gln; his
Asp                       glu
Cys                       ser
Gln                       asn
Glu                       asp
Gly                       pro
His                       asn; gln
Ile                       leu; val
Leu                       ile; val
Lys                       arg; gln; glu
Met                       leu; ile
Phe                       met; leu; tyr
Ser                       thr
Thr                       ser
Trp                       tyr
Tyr                       trp; phe
Val                       ile; leu



Substantial changes in immunological identity are made by selecting substitutions that are less conservative
than those in Table 7, i.e., selecting residues that differ more significantly in their effect on maintaining (a) the
structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical
conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in protein properties will be those
in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g.,
leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue;
(c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine,
is substituted for (or by) one not having a side chain, e.g., glycine. However, such variants must retain the ability
to stimulate IFN-γ production.

VIII. USE OF CLONED MYCOBACTERIUM SEQUENCES TO PRODUCE VACCINES 

The purified peptides encoded by the nucleotide sequences of the present invention may be used directly as 
immunogens for vaccination. The conventional tuberculosis vaccine is the BCG (bacille Calmette-Guérin) vaccine,
which is a live vaccine comprising attenuated Mycobacterium bovis bacteria. However, the use of this vaccine in a
number of countries, including the U.S., has been limited because administration of the vaccine interferes with the
use of the tuberculin skin test to detect infected individuals (see Cecil Textbook of Medicine (Ref. 33), pages
1733-1742 and section IX(B) below).

The present invention provides a possible solution to the problems inherent in the use of the BCG vaccine 
in conjunction with the tuberculin skin test. The solution proposed is based upon the use of one or more of the 
immunostimulatory M. tuberculosis peptides disclosed herein as a vaccine and one or more different 
immunostimulatory M. tuberculosis peptides disclosed herein in the tuberculin skin test (see section IX(B)
below). If the immune system is primed with such a vaccine, it will be able to resist an infection by M.
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TABLE 6 

The Degeneracy of the Genetic Code

Number of                                                             Total
Synonymous                                                            Number of
Codons          Amino Acid                                            Codons

6               Leu, Ser, Arg                                         18
4               Gly, Pro, Ala, Val, Thr                               20
3               Ile                                                    3
2               Phe, Tyr, Cys, His, Gln, Glu, Asn, Asp, Lys           18
1               Met, Trp                                               2

Total number of codons for amino acids                                61
Number of codons for termination                                       3
Total number of codons in genetic code                                64



Additionally, standard mutagenesis techniques may be used to produce peptides which vary in amino acid
sequence from the peptides encoded by the DNA molecules disclosed herein. However, such peptides will retain
the essential characteristic of the peptides encoded by the DNA molecules disclosed herein, i.e., the ability to
stimulate IFN-γ production. This characteristic can readily be determined by the assay technique described above.
Such variant peptides include those with variations in amino acid sequence including minor deletions, additions and
substitutions.

While the site for introducing an amino acid sequence variation is predetermined, the mutation per se need
not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random
mutagenesis may be conducted at the target codon or region and the expressed protein variants screened for the
optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in
DNA having a known sequence as described above are well known.

In order to maintain the functional epitope, preferred peptide variants will differ by only a small number of
amino acids from the peptides encoded by the DNA sequences disclosed herein. Preferably, such variants will be
amino acid substitutions of single residues. Substitutional variants are those in which at least one residue in the
amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally
are made in accordance with the following Table 7 when it is desired to finely modulate the characteristics of the
protein. Table 7 shows amino acids which may be substituted for an original amino acid in a protein and which
are regarded as conservative substitutions. As noted, all such peptide variants are tested to confirm that they retain
the ability to stimulate IFN-γ production.
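The enumeration of single-residue conservative variants lends itself to a simple sketch. The Python code below encodes the substitutions of Table 7 in one-letter amino acid code (an encoding chosen here for convenience) and lists every variant that differs from a peptide at exactly one position. The example peptide and the function name are hypothetical; each variant would still need to be tested for IFN-γ stimulation as stated above.

```python
# Conservative substitutions of Table 7, transcribed into one-letter code.
CONSERVATIVE = {
    "A": "S", "R": "K", "N": "QH", "D": "E", "C": "S", "Q": "N", "E": "D",
    "G": "P", "H": "NQ", "I": "LV", "L": "IV", "K": "RQE", "M": "LI",
    "F": "MLY", "S": "T", "T": "S", "W": "Y", "Y": "WF", "V": "IL",
}


def single_substitution_variants(peptide):
    """List all variants carrying exactly one conservative substitution."""
    variants = []
    for i, residue in enumerate(peptide):
        # Residues without an entry in Table 7 (e.g. Pro) are left unchanged.
        for replacement in CONSERVATIVE.get(residue, ""):
            variants.append(peptide[:i] + replacement + peptide[i + 1:])
    return variants


if __name__ == "__main__":
    print(single_substitution_variants("MAG"))  # hypothetical tripeptide
```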
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20 



microorganism as a vaccine. As described in International Patent Application WO 95/01441, Mycobacterium
bovis BCG may be employed for this purpose, although this approach would destroy the advantage outlined above
to be gained from using separate classes of the peptides as vaccines and in the skin test. As disclosed in
WO 95/01441, an immunostimulatory peptide of M. tuberculosis can be expressed in the BCG bacterium by
transforming the BCG bacterium with a nucleotide sequence encoding the M. tuberculosis peptide. Thereafter, the
BCG bacteria can be administered in the same manner as a conventional BCG vaccine. In particular embodiments,
multiple copies of the M. tuberculosis sequence are transformed into the BCG bacteria to enhance the amount of M.
tuberculosis peptide produced in the vaccine strain.

IX. USE OF CLONED MYCOBACTERIUM SEQUENCES IN DIAGNOSTIC ASSAYS

Another aspect of the present invention is a composition for diagnosing tuberculosis infection wherein the 
composition includes peptides encoded by the nucleotide sequences of the present invention. The invention also 
encompasses methods and compositions for detecting the presence of anti-tuberculosis antibodies, tuberculosis
peptides and tuberculosis nucleic acid sequences in body samples. Three examples typify the various techniques
that may be used to diagnose tuberculosis infection using the present invention: an in vitro ELISA assay, an in
vivo skin test assay and a nucleic acid amplification assay.
A. IN VITRO ELISA ASSAY 

One aspect of the invention is an ELISA that detects anti-tuberculosis mycobacterial antibodies in a medical 
specimen. An immunostimulatory peptide encoded by a nucleotide sequence of the present invention is employed 
as an antigen and is preferably bound to a solid matrix such as a crosslinked dextran such as SEPHADEX
(Pharmacia, Piscataway, NJ), agarose, polystyrene, or the wells of a microtiter plate. The polypeptide is admixed
with the specimen, such as human sputum, and the admixture is incubated for a sufficient time to allow
antimycobacterial antibodies present in the sample to immunoreact with the polypeptide. The presence of the
immunopositive immunoreaction is then determined using an ELISA assay.

In a preferred embodiment, the solid support to which the polypeptide is attached is the wall of a microtiter
assay plate. After attachment of the polypeptide, any nonspecific binding sites on the microtiter well walls are
blocked with a protein such as bovine serum albumin. Excess bovine serum albumin is removed by rinsing and the
medical specimen is admixed with the polypeptide in the microtiter wells. After a sufficient incubation time, the
microtiter wells are rinsed to remove excess sample and then a solution of a second antibody, capable of detecting
human antibodies, is added to the wells. This second antibody is typically linked to an enzyme such as peroxidase,
alkaline phosphatase or glucose oxidase. For example, the second antibody may be a peroxidase-labeled goat
anti-human antibody. After further incubation, excess amounts of the second antibody are removed by rinsing and a
solution containing a substrate for the enzyme label (such as hydrogen peroxide for the peroxidase enzyme) and a
color-forming dye precursor, such as o-phenylenediamine, is added. The combination of mycobacterium peptide
(bound to the wall of the well), the human antimycobacterial antibodies (from the specimen), the
enzyme-conjugated anti-human antibody and the color substrate will produce a color that can be read using an
instrument that determines optical density, such as a spectrophotometer. These readings can be compared to a control
incubated with water in place of the human body sample, or, preferably, a human body sample known to be free of
antimycobacterial antibodies. Positive readings indicate the presence of antimycobacterial antibodies in the
specimen, which in turn indicates a prior exposure of the patient to tuberculosis.
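The comparison of specimen readings with negative controls can be sketched in Python as follows. The specification does not define a numerical threshold; the cutoff used here (mean of the negative controls plus three standard deviations) is a common laboratory convention adopted purely as an assumption, and the optical density values in the usage lines are invented for illustration.

```python
from statistics import mean, stdev


def positivity_cutoff(negative_controls, n_sd=3.0):
    """Cutoff = mean + n_sd * sample standard deviation of the negative controls.

    Assumption: this rule is a common convention, not taken from the specification.
    """
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def classify(readings, negative_controls, n_sd=3.0):
    """Map each specimen identifier to True if its reading exceeds the cutoff."""
    cutoff = positivity_cutoff(negative_controls, n_sd)
    return {specimen: od > cutoff for specimen, od in readings.items()}


if __name__ == "__main__":
    negatives = [0.08, 0.10, 0.12]              # illustrative control readings
    specimens = {"s1": 0.45, "s2": 0.11}        # illustrative specimen readings
    print(positivity_cutoff(negatives))
    print(classify(specimens, negatives))
```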
B. SKIN TEST ASSAY

Alternatively, the presence of tuberculosis antibodies in a patient's body may be detected using an
improved form of the tuberculin skin test, employing immunostimulatory peptides of the present invention.
Conventionally, this test produces a positive result to one of the following conditions: the current presence of M.
tuberculosis in the patient's body; past exposure of the patient to M. tuberculosis; and prior BCG vaccination. As
tuberculosis. However, exposure to the vaccine peptides alone will not induce an immune response to those
peptides that are reserved for use in the tuberculin skin test. Thus, the present invention would allow the clinician
to distinguish between a vaccinated individual and an infected individual.

Methods for using purified peptides as vaccines are well known in the art and are described in the
following publications: Pal and Horwitz (1992) (reference no. 8) (describing immunization with extra-cellular
proteins of Mycobacterium tuberculosis); Yang et al. (1991) (reference no. 30) (vaccination with synthetic peptides
corresponding to the amino acid sequence of a surface glycoprotein from Leishmania major); Andersen (1994)
(reference no. 9) (vaccination using short-term culture filtrate containing proteins secreted by Mycobacterium
tuberculosis); and Jardim et al. (1990) (reference no. 10) (vaccination with synthetic T-cell epitopes derived from
Leishmania parasite). Methods for preparing vaccines which contain immunogenic peptide sequences are also
disclosed in U.S. Patent Nos. 4,608,251; 4,601,903; 4,599,231; 4,599,230; 4,596,792 and 4,578,770. The
formulation of peptide-based vaccines employing M. tuberculosis peptides is also discussed extensively in
International Patent Application WO 95/01441.

As is well known in the art, adjuvants such as Complete Freund's Adjuvant (CFA) and Incomplete
Freund's Adjuvant (IFA) may be used in formulations of purified peptides as vaccines. Accordingly, one
embodiment of the present invention is a vaccine comprising one or more immunostimulatory M. tuberculosis
peptides encoded by genes including a sequence shown in the attached sequence listing, together with a
pharmaceutically acceptable adjuvant.

Additionally, the vaccines may be formulated using a peptide according to the present invention together
with a pharmaceutically acceptable excipient such as water, saline, dextrose and glycerol. The vaccines may also
include auxiliary substances such as emulsifying agents and pH buffers.

It will be appreciated by one of skill in the art that vaccines formulated as described above may be
administered in a number of ways including subcutaneous, intra-muscular and intra-venous injection. Doses of the
vaccine administered will vary depending on the antigenicity of the particular peptide or peptide
combination employed in the vaccine, and characteristics of the animal or human patient to be vaccinated. While
the determination of individual doses will be within the skill of the administering physician, it is anticipated that
doses of between 1 microgram and 1 milligram will be employed.

As with many vaccines, the vaccines of the present invention may routinely be administered several times
over the course of a number of weeks to ensure that an effective immune response is triggered. As described in
International Patent Application WO 95/01441, up to six doses of the vaccine may be administered over a course
of several weeks, but more typically between one and four doses are administered. Where such multiple doses are
administered, they will normally be administered at from two to twelve week intervals, more usually from three to
five week intervals. Periodic boosters at intervals of 1-5 years, usually three years, will be desirable to maintain
the desired levels of protective immunity.

As described in WO 95/01441, the course of the immunization may be followed by in vitro proliferation
assays of PBL (peripheral blood lymphocytes) co-cultured with ESAT6 or ST-CF, and especially by measuring the
levels of IFN-γ released from the primed lymphocytes. The assays are well known and are widely described in the
literature, including in U.S. Patent Nos. 3,791,932; 4,174,384 and 3,949,064.

To ensure an effective immune response against tuberculosis infection, vaccines according to the present
invention may be formulated with more than one immunostimulatory peptide encoded by the nucleotide sequences
disclosed herein. In such cases, the amount of each purified peptide incorporated into the vaccine will be adjusted
accordingly.

Alternatively, multiple immunostimulatory peptides may also be administered by expressing the nucleic
acids encoding the peptides in a nonpathogenic microorganism, and using this transformed nonpathogenic
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for
use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (1988).
B. ANTIBODIES RAISED AGAINST SYNTHETIC PEPTIDES. 

An alternative approach to raising antibodies against the M. tuberculosis peptides is to use synthetic 
peptides synthesized on a commercially available peptide synthesizer based upon the amino acid sequence of the 
peptides predicted from nucleotide sequence data. 

In a preferred embodiment of the present invention, monoclonal antibodies that recognize a specific M.
tuberculosis peptide are produced. Optimally, monoclonal antibodies will be specific to each peptide, i.e., such
antibodies recognize and bind one M. tuberculosis peptide and do not substantially recognize or bind to other
proteins, including those found in healthy human cells.

The determination that an antibody specifically detects a particular M. tuberculosis peptide is made by any
one of a number of standard immunoassay methods; for instance, the Western blotting technique (Sambrook et al.,
1989). To determine that a given antibody preparation (such as one produced in a mouse) specifically detects one
M. tuberculosis peptide by Western blotting, total cellular protein is extracted from a sample of human sputum
from a healthy patient and from sputum from a patient suffering from tuberculosis. As a positive control, total
cellular protein is also extracted from M. tuberculosis cells grown in vitro. These protein preparations are then
electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. Thereafter, the proteins are transferred to a
membrane (for example, nitrocellulose) by Western blotting, and the antibody preparation is incubated with the
membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically
bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline
phosphatase; application of the substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the
production of a dense blue compound by immune-localized alkaline phosphatase. Antibodies which specifically
detect the M. tuberculosis protein will, by this technique, be shown to bind to the M. tuberculosis-extracted sample
at a particular protein band (which will be localized at a given position on the gel determined by its molecular
weight) and to the proteins extracted from the sputum from the tuberculosis patient. No significant binding will be
detected to proteins from the healthy patient sputum. Non-specific binding of the antibody to other proteins may
occur and may be detectable as a weak signal on the Western blot. The non-specific nature of this binding will be
recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary
signal arising from the specific antibody-tuberculosis protein binding. Preferably, no antibody would be found to
bind to proteins extracted from healthy donor sputum.

Antibodies that specifically recognize a M. tuberculosis peptide encoded by the nucleotide sequences 
disclosed herein are useful in diagnosing the presence of tuberculosis antigens in patients. 

All publications and published patent documents cited in this specification are incorporated herein by
reference to the same extent as if each individual publication or patent application was specifically and individually
indicated to be incorporated by reference.
noted above, if one group of immunostimulatory peptides is reserved for use in vaccine preparations, and another
group reserved for use in the improved skin test, then the skin test will not produce a positive response in
individuals whose only exposure to tuberculosis antigens was via the vaccine. Accordingly, the improved skin test
would be able to properly distinguish between infected individuals and vaccinated individuals.
The tuberculin skin test consists of an intradermal injection of proteins from M. tuberculosis.
The test is described in detail in Cecil Textbook of Medicine (Ref. 33), pages 1733-1742. If the
subject has reactive T-cells to the injected protein, the cells will migrate to the site of injection and cause a local
inflammation. This inflammation, which is generally known as delayed type hypersensitivity (DTH), is indicative
of M. tuberculosis antibodies in the patient's blood stream. Purified immunostimulatory peptides according to the
present invention may be employed in the tuberculin skin test using the methods described in reference 33.
C. NUCLEIC ACID AMPLIFICATION 

One aspect of the invention includes nucleic acid primers and probes derived from the sequences set forth
in the attached sequence listing, as well as primers and probes derived from the full length genes that can be
obtained using these sequences. These primers and probes can be used to detect the presence of M. tuberculosis
nucleic acids in body samples and thus to diagnose infection. Methods for making primers and probes based on
these sequences are well known and are described in section V above.

The detection of specific pathogen nucleic acid sequences in human body samples by polymerase chain
reaction amplification (PCR) is discussed in detail in reference 17, in particular, part four of that reference. To
detect M. tuberculosis sequences, primers based on the sequences disclosed herein would be synthesized, such that
PCR amplification of a sample containing M. tuberculosis DNA would result in an amplified fragment of a
predicted size. If necessary, the presence of this fragment following amplification of the sample nucleic acid could
be detected by dot blot analysis (see chapter 48 of reference 17). PCR amplification employing primers based on
the sequences disclosed herein may also be employed to quantify the amounts of M. tuberculosis nucleic acid
present in a particular sample (see chapters 8 and 9 of reference 17). Reverse-transcription PCR using these
primers may also be utilized to detect the presence of M. tuberculosis RNA, indicative of an active infection.
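The notion of an amplified fragment of a predicted size can be illustrated with a minimal in-silico PCR sketch in Python. The template and primers in the usage lines are hypothetical toy sequences, the function names are assumptions, and for simplicity the sketch requires exact matches and considers only the orientation in which the forward primer matches the given strand.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def predicted_amplicon_size(template, forward, reverse):
    """Return the expected product length in bp, or None if a primer site is absent.

    The forward primer must match the given strand; the reverse primer must
    match the opposite strand downstream of the forward primer site.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


if __name__ == "__main__":
    # Hypothetical toy template: 2 nt flank, 8 nt site, 5 nt spacer, 8 nt site, 2 nt flank.
    template = "GG" + "ACGTTGCA" + "TTTTT" + "CCGGAATT" + "AA"
    forward = "ACGTTGCA"
    reverse = reverse_complement("CCGGAATT")
    print(predicted_amplicon_size(template, forward, reverse))
```

A sample yielding a band of the predicted length would be scored as containing the target sequence, subject to confirmation such as the dot blot analysis mentioned above.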

Alternatively, probes based on the nucleic acid sequences described herein may be labelled with suitable
labels (such as 32P or biotin) and used in hybridization assays to detect the presence of M. tuberculosis nucleic acid
in provided samples.

X. USE OF CLONED MYCOBACTERIUM SEQUENCES TO RAISE ANTIBODIES 

Monoclonal antibodies may be produced to the purified M. tuberculosis peptides for diagnostic purposes.
Substantially pure M. tuberculosis peptide suitable for use as an immunogen is isolated from the transfected or
transformed cells as described above. The concentration of protein in the final preparation is adjusted, for
example, by concentration on an Amicon filter device, to the level of a few milligrams per milliliter. Monoclonal
antibody to the protein can then be prepared as follows:

A. MONOCLONAL ANTIBODY PRODUCTION BY HYBRIDOMA FUSION.

Monoclonal antibody to epitopes of the M. tuberculosis peptides identified and isolated as described can be
prepared from murine hybridomas according to the classical method of Kohler and Milstein (1975) or derivative
methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected purified protein
over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen
isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess
unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The
successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of
the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid
of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (1980), and derivative
(A) LENGTH: 265

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciISl-62
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1

ACGCGGACCT CGAAGTTCAT CATCGAGTGA TACGTGCCAC ACATCTCGGC 50
GCAGTGGCCC ACGAATGCAN CCGGTCTTGG TGATTTCNTC GATCTGGAAG 100
ACGTTGACCG ARTTGTTTGC CACCGGGTTA GGCATCACGT CACGCTTGAA 150
CAAGAACTCC GGCACCCAGA ATGCGTGTGT CACATCGGCT GAGGCCATTT 200
GGAATTCGAT ACGCTTGCCG GACGGCAGCA CCAGCACCGG AATTTCGGTG 250
CTGTGCAACG TCTCG 265
(2) INFORMATION FOR SEQ ID NO: 2 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 484 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-152 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 2 

CTGGTACGAC GCCGGCAAGG ACTACGGACG AGGTGGCACA GAATTCAATG 50 
CGGCGCTCAT CGGAACCGAC GTGCCCGACG NCGTTTGCTC GACGACGATG 100 
GTGNTTCCAN TTCGCCTNAN CGGTGTNCTG ACTGCCNTTG ACGACCTGNT 150 
CGGCCARGTT GGGNTGGACA CAACGGATTA CGTCGATTCG CTGCTGGCCG 200 
ACTATGAGTT CAACGGCCGC CATTACGCTG TGCCGTATGC TCGCTCGACG 250 
CCGCTGTTCT ACTACAACAA GGCGGCGTGG CAACAGGCCG GCCTACCCGA 300 
CCGCGGACCG CAATCCTGGT CAGAGTTCGA CGAGTGGGGT CCGGAGTTAC 350 
AGCGCGTGGT CGNCGCCGGT CGATCGGCGC ACGGCTGCGT AACGCCGACC 400 
TCATCTCGTG GACGTTTCAG GGACCGAACT GGGCATNCGG CGGTGCCTAC 450 
TCCGACAAGT GGACATTGAC ATTGACCGAG CCCG 484 
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SEQUENCE LISTING 

(1) GENERAL INFORMATION 
(i) APPLICANTS: UNIVERSITY OF VICTORIA INNOVATION AND 
DEVELOPMENT CORPORATION 

(ii) TITLE OF INVENTION: MYCOBACTERIUM TUBERCULOSIS DNA 
SEQUENCES ENCODING IMMUNOSTIMULATORY PEPTIDES 

(iii) NUMBER OF SEQUENCES: 75 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Klarquist Sparkman Campbell Leigh 
& Whinston, LLP 
(B) STREET: One World Trade Center, Suite 1600, 
121 S.W. Salmon Street 

(C) CITY: Portland 
(D) STATE: OR 

(E) COUNTRY: USA 

(F) ZIP: 97204-2988 
(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Disk, 3.5-inch 
(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: MS DOS 

(D) SOFTWARE: WordPerfect 5.1-r, ASCII 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US96/10375 
(B) FILING DATE: June 14, 1996 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/000,254 

(B) FILING DATE: 06/15/95 
(viii) ATTORNEY/AGENT INFORMATION 

(A) NAME: Richard J. Polley 

(B) REGISTRATION NUMBER: 23,107 

(C) REFERENCE/DOCKET NUMBER: 2847-45176/RJP 
(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (503) 226-7391 

(B) TELEFAX: (503) 228-9446 
(2) INFORMATION FOR SEQ ID NO: 1 
(i) SEQUENCE CHARACTERISTICS: 
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TGCCGACGTT GCCCTCGCCG ACGTTCGCCA AGCCCAGGTT GCGGACACGC 200 
CGGTGATTGT GCGTGGGGCA ATGACGGGCT GCTGGCCCGG CCGAATTCCA 250 
AGGCGTCGAT CGGCACGGTG TTCCAGGACC GGGCCGCTCG CTACGGTGAC 300 
CGAGTCTTCC TGAAATTCGG CGATCAGCAG CTGACCTACC GCGACCGTAA 350 
CGCCACCGCC AACCGGTNNG CCGCGGTGTT GGCCNNNCGC GGCGTCGGCC 400 
CCGGCGACGT CGTTGGCATC ATGTTGCGTA ACTCACCCAG CACAGTCTTG 450 
GCGATGCTGG CCACGGTCAA GTGCGGCGTA TCGCCGGCAT GCTCAACTAC 500 
CACCAGCGCG 510 
(2) INFORMATION FOR SEQ ID NO: 5 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 456 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-426 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 5 

GCAACGGAGA GGTGGACTAT GCCGGACCGG CACCGCGAAG GGGTTGGTGC 50 
CGGCCCGGGT GGTGACGGTG CACATTCTGC GCAATTCGCT GAGTTCCGGT 100 
GGTGACCTTC CTGGGCGCGG AGTCTGGGCG CGCTGATGGC GGAGCGAKTG 150 
TGACCGAAGG AANTCNGTTC AACATCCACG GCGTCGGGGG CGTGCTGTAT 200 
CAAGCGGTCA CCGTCAGGAG ACGCCGACGG TGGTGTCGAT CGTGACGGTG 250 
CTGGTGCTGA TCTACCTGAT CACCAATCTG TTGGTGGATC TGCTGTATGC 300 
GGCCCTGGAC GCCGNNGATN CGCTATGGCT GAGCACACGG GGTTCTGGCT 350 
CGATGCCTNG CGCGGGTTGC GCCGGCGTCC TAAANTCGTG ATCGCGCGGC 400 
GCTGAKCCTG CTGATTCTTG TCGTGGCGGC GTTTCCGTCG TTGTTTACCG 450 
CAGCCG 456 



(2) INFORMATION FOR SEQ ID NO: 6 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 
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(2) INFORMATION FOR SEQ ID NO: 3 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 513 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-239 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 3 

GGCGGCCAGA CGTCGGAACT CGCGGCCAAT TGGTGTGGTG GGAACCGCGA 50 
TCCTCGACGC AACCGCTTCG CGGTCTTGGC AGTGTTCGAT GCCAATCTGC 100 
CGGCCGGGAC GCTGCCGGAT GCGGCCCGTT CACCGAGGCT GGTGACAAGA 150 
CCTGGCGTTG TCGTTCCGGG CACTACTCCC NAGGTCGGTC AAGGCACCGT 200 
CAAAGTGTTC AGGTATACCG TCGAGATCGA GAACGGTCTT GATCCCACAA 250 
TGTACGGCGG TGACAANNNN ATTCGCCCAG ATGGTCGACC AGACGTTGAC 300 
CAATCCCAAG GGCTGGACCC ACAATCCGCA ATTCGGCGTT CGTGCGGATC 350 
GACAGCGGAA AACCCGACTT CCGGATTTCG CTGGTGTCGC CGACGACAGT 400 
GCGCGGGGGN TGTGGCTACG AATTCCGGCT CGAGACGTCC TGCTACAACC 450 
CGTCGTTCGG CGGCATGGAT CGCCAATCGC GGGTGTTCAT CAACGAGGCG 500 
CGCTGGGTAC GCG 513 
(2) INFORMATION FOR SEQ ID NO: 4 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 510 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-247 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 4 

GTGTGCAACC AGTGTGTGTN CGTGTGCGAA CCAGTGTGTA GTGGTAACCA 50 
GGACCACGTT GCAAACCAGT GTTGGAGTGC AGTGTTGCGT GCNAGTGTTG 100 
CNCGTTGCAG TGTTNGNCGA GCCGAGATTG GAAGTTNCCG ACATTACCGT 150 



SUBSTITUTE SHEET (RULE 26) 



WO 97/00067 PCT/US96/10375 






GTTCGNCGCG CTCAAAAGGT TGACGATGGT CACGTCGCAC GTGCTGGCCG 50 
AGACCAAGGT GGATTTCGGT GAAGACCTCA AAGANCTCTA CTCGNATCGT 100 
CAAGGCCCTC AACGACGACC GAAAGGATTT CGTCACCTCG CTGCAGCTGT 150 
TGCTGACGTT CCCATTTCCC AAC 173 
(2) INFORMATION FOR SEQ ID NO: 9 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 223 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-35 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 9 

CCTGTTNCAA CGGTNCNTTC NCGGAACGGA CGACTTCTGA TNCGNNCTCG 50 
GNCGTTCCCT CGCACCGGTC GATGGTGATC AAGGTCAGCG TCTTCGCGGT 100 
GGTCATGCTG CTGGTGGCCG CCGGTCTGGT GGTGGTATTC GGGGACTTCC 150 
GGTTTGGTCC CACAACCGTC TACCACGCCA CCTTCACCGA CNCGTNGCGG 200 
CTGAANGCAG GCCAGAAGGT TCG 223 
(2) INFORMATION FOR SEQ ID NO: 10 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-272 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 10 

CAACGAGATC GCACCCGTGA TTAGGAGGTG ACGGTGGCAG CGCCGACCCC 50 
GTCGAATCGG ATCGAAGTAA CGCTCCGTAG ACGCCAGCTC GTCCGCGCCG 100 
ATGCCGACCT GCCACCCGTG 120 

(2) INFORMATION FOR SEQ ID NO: 11 
(i) SEQUENCE CHARACTERISTICS: 






(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-2 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 6 

TCNCTTANYC CTTCANCTGN CATCTNTCCC AANNACCGAA NTCTGGACCT 50 
ATSACGNCCA NCTNAANATG NCCCNCGACN AAGGNCNTTG NACGTTCNCT 100 
GKACCACCAN CGGGTTGCAT SCCAAGCTAG NCGAACATCA NASGTTNCGC 150 
GCNTACGAGC CGACCCGCCG CGGCG 175 
(2) INFORMATION FOR SEQ ID NO: 7 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 231 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-23 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 7 

CTTCTCGCGC CAGCCGTCCC GCTGTCCGGG ATGCGCTACC GGTCGTCAGC 50 
GCCAAGACGG TGCAGCTCAA CGACGGCGGG TTGGTGCGCA CGGTGCACTT 100 
GCCGGCCCCC AATGTSGCGG GGCTGCTGAG TGCGGCCGCG TGCCGCTGTT 150 
GCAAANNGCG ACCACGTGGT GCCCGCCGCG ACGGCCCCGA TCGTCGAAGG 200 
CATGCAGATC CAGGTGACCC GCAAATCGGA T 231 
(2) INFORMATION FOR SEQ ID NO: 8 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-26 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 8 
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(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-511 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 13 

GCGNACNCTG CGCATNGCTG CCNGTANCCC GGCGCCNAGG CATGAGNCNN 50 
TAGGCCGAAA TGCCTGGTKA ANCTNGCGTG TSGTGGTTGA CCCGCNGCGT 100 
SCNGGCNTAC AKGTGCATGC TGTNGATCGG CAGTGGGAGA GGTGAGCGGT 150 
GCGGCGTNAA GGTGCGGAGG TTNGASNTCT GGCGGTGTCG GCGTTNGGTG 200 
GCTTTGTTCC CGGCGGTCGC GGGGTGCTCC NGNATTCCGG CGACNAACNA 250 
AANNCCGGGN AGSACGAYNC CCGTCGACAC CNGGCAAACG CTGAGGGCCG 300 
GCACGGACCC TTCTTCCCGC AATGTGGCGG CGTCAGCGAT CANGACGGTG 350 
ACCGAGCTGW ACAAGGGTGA CCGGGCTGGT CAACACCGCC AAGAAGTCGG 400 
TGGGCTNCCA ATGGCNTGGC G 421 
(2) INFORMATION FOR SEQ ID NO: 14 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 175 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-523 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 14 

CCAGNCCNCC NAACNTGTYN CGNTCTCAYY TCGCCGTCGC TGCCGGTNCG 50 
TGTGTGCACC ATCTGCACCG ACCCGTGKAA CYTCGATCAC GANACTGGNA 100 
GAGNTCAGGC ATNAAAGCCG GAGTGGCACA GCAACGGTCG CTACTGGAAT 150 
TGGCGAAGCT GGATGCTGAG CTGAC 175 
(2) INFORMATION FOR SEQ ID NO: 15 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
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(A) LENGTH: 16 0 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-506 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 11 

CNGGCNNCCA NCGGGTGCGC CAWGCACGGC CGGTCCGTGC GAGATCGTCN 50 
CNAATGGCAN GCCGGCGCCC AAKANANNNC CGGTACCGTG CCTTCGTNGW 100 
GCAWCCTNGC GACCAACCCC GAGATYGCYA CNCTACNGCC GGKACATGAC 150 
CGTGGTGCGG 160 

(2) INFORMATION FOR SEQ ID NO: 12 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 133 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-508 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 12 

GACTGGNCCC GAYGYTGTGN CCGGHNCGTH GGNCGHGCHG CANTCGAYCC 50 
TGGCCGTTGC TTCGGTGCCG GGTTGTTCAT CGCCTTCGAC CAGTTGTGGC 100 
GCTGGAACAG CATAGTGGCG CTAGTGCTAT CGG 133 
(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
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CCGGGATTGG CACGGTCGGC AABGCATTCG TCAGCNNTGC GCTCGAAGGT 150 
CAACAAGAAT GTCGGGGTCT ACGCGGTGAA A 181 
(2) INFORMATION FOR SEQ ID NO: 18 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 95 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-372 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 18 

AGGTKACGGT GGCAGGGGGG ACCCCGTCGA ATCGGWTCGA AGAAYGCTCC 50 
GKACACGCCA GCTGCGTCCG YGCCGATGCC GACCTGCCAC GCGTG 95 

(2) INFORMATION FOR SEQ ID NO: 19 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 65 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-884d 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 19 

AKCGGTCACC KACGGGCCGG CCACCGATGC GATTGTCAAC GGATTCCAAG 50 
TGGTTGYGCA TGCGC 65 
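
Each entry in this listing states a LENGTH and prints the residues in groups of ten with a running position count at the end of each row. The residues include IUPAC ambiguity symbols (for example K, Y, W, S, R and N) where a base was not resolved. The following Python sketch is not part of the original filing; it is offered only as an illustration of how a reader might check an entry against its declared length, using SEQ ID NO 19 as the example. The function names `listed_residues` and `ambiguous_positions` are hypothetical and chosen for this illustration.

```python
import re

# IUPAC nucleotide symbols: the four bases plus the standard ambiguity codes.
IUPAC_DNA = set("ACGTRYKMSWBDHVN")


def listed_residues(block: str) -> str:
    """Strip spacing and running position counters from a listing block."""
    residues = re.sub(r"[\d\s]+", "", block).upper()
    unexpected = set(residues) - IUPAC_DNA
    if unexpected:
        raise ValueError(f"characters outside the IUPAC code: {sorted(unexpected)}")
    return residues


def ambiguous_positions(residues: str) -> list[int]:
    """Return the 1-based positions whose symbol is not A, C, G or T."""
    return [i for i, base in enumerate(residues, start=1) if base not in "ACGT"]


# SEQ ID NO 19 as printed in the listing (declared LENGTH: 65).
SEQ_ID_NO_19 = """
AKCGGTCACC KACGGGCCGG CCACCGATGC GATTGTCAAC GGATTCCAAG 50
TGGTTGYGCA TGCGC 65
"""

residues = listed_residues(SEQ_ID_NO_19)
print(len(residues))                  # compare with the declared LENGTH
print(ambiguous_positions(residues))  # positions carrying an ambiguity symbol
```

A mismatch between the residue count and the declared LENGTH, or a character outside the IUPAC set, would suggest a transcription or scanning error in that entry rather than a property of the sequence itself.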

(2) INFORMATION FOR SEQ ID NO: 20 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 156 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 
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(D) OTHER INFORMATION: AciI#2-639 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 15 

GGGCTGGATT CGAGGCTCGT GCATGNCGTA CGACTANGGG TAGCGCCCAG 50 
CTGCTCAATA CCATCGGTTG GATAACAAAG GCTGAACATG AATGGCNTGA 100 
TCTCNACAAG CGTGCGGCTC CCACCGACCC CGGCGCCCCT CGAGCCTGGG 150 
GSTGTCGCGA TCCTGATCGC GGCGACACTT TTCGCGACTG TCGTTGCGGG 200 
GTGCGGGAAA AAACCGACCA CGGCGAGCTC CCGAGTCCCG GGTCGCCGTC 250 
GCCGGAAGCC CAC 263 
(2) INFORMATION FOR SEQ ID NO: 16 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 168 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-322 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 16 

YGCCATGCGA AGCGCACCCC GGTCCGGAAG NCCTGCACAG TTCWNCCGTG 50 
CTCGCCGCGA CGCTACTCCT CGNYTGCGGC GGTCCCAYGC AGCCAYGCAG 100 
CATCACCTTG ACCTTTATCC GCAACGYGYA ATYCCAGGCC AAYGCCGAYG 150 
GGATCATCGA YACCKACA 168 
(2) INFORMATION FOR SEQ ID NO: 17 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#2-854 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 17 

ACCNGTTCCC GCCGGNCTNA CNCNCGGTGC CGTTGCACCG GCCANCTGCA 50 
GCCTGCCCCG ACGCCGAAGT GGTGTTCGCN CCGCGGCCGC TTCGAACCGC 100 
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GGANCTCGAC ATCAATKCAN CCGGAGNAGN ANGCTGACCN AACATNCGCT 200 
CATCGACCGC GGATGTCNAT CGAGNACGST GCCAAGSCGC TGCAGCTGGA 250 
TNCTCGAGCG CGCCATGGAG TNATRTGCGS CCGACGAATN CGTCGAGGTG 300 
ACCCCGGAGA NTCGTGCGGA TSCGCRAAGT CGAGCTGGCC GGCCNGCCGC 350 
CCGGGCTNMG CAGCCGGGCG CGCACCNAAG GCGCGTGGCN TAGCANACTT 400 
GGCGNGCTGG CCGCGCGAGC GTANACNGCC ACTGCGAAAN TCCANGCCCG 450 
GCTTTTCGCA GCCGGGTTNA CGCTCGTGGG GGTACTGGAT AGCCTGATGG 500 
GCGTGCCCAG NCCCANGTCC GCCGCGTCTG TGTGACGGTC GGCGCGTTGG 550 
TCGCGCTGGC GTGTATGGTG TTGGCCGGGT GCACGGTCAG CCCGCCGCCG 600 
GCACCCCAGA GCASTGATAC GCCGCGCAGC ACACCG 636 
(2) INFORMATION FOR SEQ ID NO: 23 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-916 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 23 

CTTCCGGCGG GACAACAACA GGTCTCACCG GCGCCACACC CTGACACCTG 50 
ATCGCGTCTG CCGATCCCGG TCGGAGCACC CGGGTTCCAC CGCTGTGCCC 100 
CCC 103 

(2) INFORMATION FOR SEQ ID NO: 24 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 207 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1014 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 24 
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(A) ORGANISM: Mycobac cerium Tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-884l 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 20 
TCTTCTACAA GGACGCCTTC GCCAAGCACC AGGAGCTGTT CGACGACTTG 50 
GNCGTCAACG TCAACAATGG CTTGTCCGAT CTGTACRAGC AAGWTCGAGT 100 
CGCTGCCGNB CGCAACGCGA CGAGATCATC GAGGACCTAC ACCGTTGCCA 150 
CGAACA 156 
(2) INFORMATION FOR SEQ ID NO: 21 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-894l 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 21 

ATNCCGTTCC ACTNCCGCGG CAGCAGCTGG NTTTGCGCAC ACGGTGACCC 50 
AGTGGCGNTT GGTGGGGCCT CGCTGACGGC GAGTNTGGNC GAGCGTCCTC 100 
GGTCGGTGNC CTNTCNTCCC GCC 123 
(2) INFORMATION FOR SEQ ID NO: 22 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 636 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-898 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 22 

CGGTCWHKCA ANTTGATGBC NGCGCGCAAG GCCGNCATGG TNGAGATGCC 50 
AACCACACCA CCGGCTGGNT CCGCATGGAC TTCGTGNTTS CCAGTCGCNG 100 
CCTGATTGGG TGNCGCACCG ACNNCCTNCA CCGAGACCSG TGGCTCNSGA 150 
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CGGTCCTGGC TCTCGGGCTG CTGGCCNCTG CGCCCCACCC CGCACCGGGC 200 
CGGCTTC 207 

(2) INFORMATION FOR SEQ ID NO: 27 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1084 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 27 

YCNAGNCKCG TNATNGCSGN CKCATNTNAC NGGANCCNGG ATTNCSTACG 50 
CCACNGTGAT CGCGCTGGTN GCCGCGCTGG TGGCGCGTGT ACGTGCTCTC 100 
GTCCACCGGN AANTAAGCGC ACCATCGTGG GCTACTTCAC CTCTGCTGTC 150 
GGGCTCTATC CCGGTGACCA GGTCCGCGTC CTGGGCGTCC NGGTGGGTGA 200 
GATCGACATG ATCGAGCCGC GGTCGTCCGA CGTSAAGATC ACTATGTCGG 250 
TGTCCAAGGA CGTCAAGGTG CCCGTGSACG NTGCAGGCC 289 

(2) INFORMATION FOR SEQ ID NO: 28 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 198 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1089 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 28 

TTGNACCANG CCTATCGCAA GCCAATCACC TATGACACGC TGTGGCAGGC 50 
TGACACCGAT CCGCTGCCAG TCGTCTTCCC CATTGTGCAA GGTGAACTGA 100 
GCAANGCAGA CCGGACAACA GGTATCGATA GCGCCGAATG CCGGCTTGGA 150 
CCCGGTGAAT TATCAGAACT TYGCAGTCAC GAACGACGGG GTGATTTT 198 

(2) INFORMATION FOR SEQ ID NO: 29 
(i) SEQUENCE CHARACTERISTICS: 
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GCCACCGGTT CATCGCGTGG TGCTGGTCAC CGCCNGGAAN GCCTCAGCGG 50 
ATCCCCTGCT GCCACCGCCG CCTATCCCTG CCCCAGTCTC GGCGCCGGCA 100 
ACAGTCCCGY CCGTGCAGAA CCTCACGGCT NCTHCCGGGC GGGAGCAGCA 150 
ACAGGTTCTC ACCGSYGCCri NGYACCCGCA CCGATCGCGT CGCCGATTCC 200 
GGTCGGA 207 

(2) INFORMATION FOR SEQ ID NO: 25 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 204 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1025 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 25 

TTNCGCANNC GTTCATCCAG GTCCACTGGT GTCGCANCTC TCNNTGATGC 50 
ACCGGTTCCG GATATATGTC NACATCNCCS TCSTCGTCCT GGTGCTGGTA 100 
CTNACGAACC TGATCGCGCA TTTCACCACA CCGTGNGCGA GCATCGCCAC 150 
CGTCCCGGCC GCCYGCGGTC GGACTGGTGA TCTTGGTKCG GAGTAGAGGC 200 
CTGG 204 

(2) INFORMATION FOR SEQ ID NO: 26 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 207 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1035 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 26 

ATACCNGTCA TCCNGCACAT NGTCAACCTN GAGTCGGTNC TCACCTACGA 50 
GGCACGCCCG AGATGCATCA CTGGTGCTCG RTCAGNCCTT CACGGCTTGG 100 
CCGCCTTCCG GTAGGACCGT HGCATGCCCG TCTTCGGCGC CTCGGGTGTT 150 
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(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-9 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 31 

CAGNCCGCTG NCCCGGAACT GTTCCAGCAG CTACAAGACC TTCGACAACG 50 
TNGCGCGTCA ACCTGCANTC GAGCGCAACC TCTCGGTGGC GCTCAACGAG 100 
TGTTCGCCGG CTTCAACCCG CTGGACCCGC GAAACCTCGA CGTGTCCCCG 150 
CTGCCTTCGC TGGCCAAGCG CGCCGCCGAC ATCCTGCGCC AGGACGTGGG 200 
CGGGCAGGTC GACATTTTCG ATGTCAATGT GCCCACCATC CAGTACGACC 250 
AGAGC 255 

(2) INFORMATION FOR SEQ ID NO: 32 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 164 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-12 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 32 

AAYNCCNGGC CRTCGACGGT NCCGGTTCNC RCCACCGGTC TATATCCACC 50 
CGGGTCNRCA TTMANANTGA NTMNCCGCCG GTGCGGCCGT CGAGCGTGAC 100 
CTGGCATCCC CTGAGACGCT GCTGGGTTGC CCCGGGGAGN TCGAMANTCG 150 
GGCATCGCAC CATC 164 
(2) INFORMATION FOR SEQ ID NO: 33 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-15 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 33 
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(A) LENGTH: 14 9 
<B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1090 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 29 

TCACGANGGT RYNACMGCAA CWCGACCGCC ACGTCASGCC GCCGCGCACG 50 
AAGATCACCG TGCCTGCNCG ATGGGTCGTG AACGGAATAG AAYGCAGCGG 100 
TGAGGTCAAN YGCGAAGCCG GGAACCAAAT CCGGTGACCG CGTCGGCAT 149 
(2) INFORMATION FOR SEQ ID NO: 30 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 210 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-1104 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 30 

GGACCCGCCA AGCATCAGCC GGTCAACAGC CGCCGCCGGT GGCCAAAGTT 50 
CGAGCAGCCG CCGGTATCGT GCTCGGCCCG GCTAGACCAA AAACTTTACG 100 
CCAGCGCCCG AAGCCACCCG ACTCCAAGGC CTCGGCCCGG TTGGGTTCGC 150 
ACATGGGTGA GTTCTATATG CCCTACCCGG GCACCCGGTT CAACCAGGAA 200 
ACCGTCTCGC 210 

(2) INFORMATION FOR SEQ ID NO: 31 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 255 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 
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GAGAACTCCG GGCCGANTTT TGGACA 26 
(2) INFORMATION FOR SEQ ID NO: 36 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 204 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-133 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 36 

TGTCGGGTNA RNGTTCGCGT CCATGATTGC TCTTGCAACG CTGTTGACGC 50 
TTATCAATCA AGTCGTCGGC ACTCCGTATA TTCCCGGTGG CGATTCTCCC 100 
GCCGGGACCG ACTGCTCGGA GCTGGCTTCG TGGGTATCGA ATGCGGCGAC 150 
GGCCAGGCCG GTTTTCGGAG ATAGGTTCAA CACCGGCAAC GAGGAAGCGC 200 
CTTG 204 
(2) INFORMATION FOR SEQ ID NO: 37 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 312 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-134 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 37 

CANNTTAGAC TGTCGTGACA TATCNCNNTN TACNCNTGGN ACGGCCATNA 50 
TTGGATAATN CGTGATAANC ACCACAAGAA TNATTCCTAT GNATATTGTC 100 
GGTACGTTCG CGNCCATGAT TNGCTCTTGC AACGCTGTTG ACGCTTATCA 150 
ATCAAGTCGT CGNCACTCCG TATATTCCCG GTGNCGATTC TCCCGCCGGG 200 
ACCGACTGCT CRGAGCTGGC TTCGTGGGTA TCGAATGCGS CGACGSCCAG 250 
GCCGGTTTTC GSAGATAGGT TCAACACCGG CAACGAGGAA GCGCCTTGGC 300 
GGCTCGGGGC TN 312 
(2) INFORMATION FOR SEQ ID NO: 38 
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ACGGACGGCA ACGGGATGCG ACCCGATCCC ACCGGTCGCC ACGAGGGACG 50 
CTACTTCGTC GCCGGGCAGC CGANCCGACC GTCNGTTCNG CGANGGCGAC 100 
NGCCGAAGCC GTTGACCCAC NTTGGTCAGC AGCAGCTGGA TSAGTCAGGT 150 
GCCGTTGGTG TTTCGCCGTC AGCGGTGTCG GGGTGGGTGC GTTCTGGGCA 200 
CCGTCGACTG TGGTGGGCGC TNGCGGGCGN TGGTGGC 237 

(2) INFORMATION FOR SEQ ID NO: 34 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-47 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 34 

CNGATNGCTC GGNCTNCGGT ACCNAACTCG NAACTCGCGC CCWYGCGNAC 50 
GCAGGNCCGC GGTTGGCACC ACCAGCGACA TCAATCANGC AGGWKNCCCG 100 
CCACGTTGCA AGACGGCGGC AATCTTCGCC TGTCGCTCAC CGACTTTCCG 150 
CCCAACTTCA ACATCTTGCA CATCGACGGC AACAACGCCG AGGTCGCGGC 200 
GATGATGAAA GCCACCTTGC CGCGCGCGTT CATCATCGGA CCGGACGGCT 250 
CGNACGNACG GTCGACACCA ACTACTTCAC CAGCATCGAG CTGACCAGGA 300 
CCGCCCCGCA GGTGGTCACC TACACCATCA ATCCCGAGGC GGTGTGGTCC 350 
GACGGGACCC CGATCACCTG GCCG 374 



(2) INFORMATION FOR SEQ ID NO: 35 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-78 (overlaps with AciI#3-167) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 35 
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CGANCCCGAB ACAANTGATT 50 
NCGATGCGCC NGGCGGCCCG 100 
CTCGCKTCAT CCGACGCCAG 150 
GCAATCGTCG TATTGGGCTG 200 
TNGTCGSGCA GTACACCTTG 250 
TATCCGACGG CCAATGTGAC 300 
TGCCGTCGAG SCCACCGACC 350 
CAGNCAACTA SAAAATCSCC 400 
GTCAGCGGTN GGCGAGCAGT 450 
GTNAAATACT TCTCCTCCGG 500 
TGAGATCGGG CCGGCGCTGG 550 
MGCCCACGGA GAAGATCGGC 600 
GGTGGGCTGG GACCCGCGNN 650 
ATCGTCGGTG ACTTCAAAAC 700 
GAACTCCGGG CCGATTTTGG 750 
ACGCTGGGCG CGCAAATTGA 800 
GAKCAGAACG TGGGAAGCAT 850 

853 

(2) INFORMATION FOR SEQ ID NO: 40 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-204 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 40 

GCGGTTGGCA CCACCAGCGA NAATCAGCAG GNDCCCGCCA CGTTGCAAGA 50 
CGGCGGCAAT CTTCGCCTGT CGCTCACCGA CTTTCCGCCC AACTTCAACA 100 
TCTTGCACAT CGACGGCAAB AABGCCGAGG TCGCGGCGAT GATGAAAGCC 150 
ACCTTGCCGC GCGCGTTCAT CATCGGACCG GACGGCTCGA CGACGGTCGA 200 
CACCAACTA 209 
(2) INFORMATION FOR SEQ ID NO: 41 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 166 



GTGNGCGCGC CNTCGAGCAN GTCTTGGCNG 
CCCGACATCC GGTACACACC GAACCCCNAA 
CTGGTAGAAA GGGGAAATCG CCAGTGCTGA 
TTGAKCCKTT TKGCGAKCGT CKCCGTAGTG 
GTACTACCTG CGAATTCCGA GTCTGGTGGG 
AAGGCCGACT TGCCCGNATC GGGTGGCCTG 
CTACCGCGGT ATCACCATTG GCAAGGTTAC 
AGGGCNGCAC GANGTGACGA TGAGCATCGC 
GTCGATGCCT NCGGCGAACG TGCATTCGGN 
ACATCGACCT NGTGTCCACC GGTGCTCCGG 
ACAGACCATC ACCAANGGCA CCGTTCCCAG 
ACAANTCCSA ATCNGCGGGT TGGCCGCATT 
TTGCTGCTCG ACGAGACNGC GCAAGCGGTG 
TTGCAACGGT TGGTCGATTC CACTCAAGCG 
CAACATTGGC GACGTCAACG ACATCATCGA 
ACAGCCAGGT CAACACGGGT GATCAGATCG 
ACAATSTGGC CGCACAGACC GCNGACCAGG 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: AciI#3-166 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 38 

AGGCCAATCG NTGATGCGAC TCGAACGGGT TCGGCGCCGA TGACTGTTTC 50 
GCGAAGTTCA TCAGCACCCT CGTTGGCGCG AAGGGCACGA CGGTGTACCG 100 
GWWRYSAMKA CRCYGCYATG AGTYTCTGCS TGTATTGCGG TGCSGAGCTT 150 
GCCGACCCGA CCAGGTGCGG KGCGTGNCGG CSCAKACWAG ATTGGTTCAA 200 
CCTGGCNATC GGACCNACGA CGCCGACGGT CGGCGCCGCG ACGACGGCAN 250 
ACGGNATNGC GACCCGANTC CNYACCNGGT CGCCACGAGG GACGNCTACT 300 
TCGTCGCCNG GCAGCCGACC GANCTCGTTN NNCGCGASGN CGACGCCGAA 350 
GCCGTTGACC CACTTGGTCA GCAGCAGCTG GNNATCANGN TCANGGTGCC 400 
GTTNNGGTGT TTCGCCGTCA GCGGTGTCGG GGTGGGTGCG TTCTGGGCAC 450 
CGTCGACTGT GGTGGGCGCT TGCGGGCGTG GTGGCGTTTC TCGGGCTGGT 500 
GGGAGCCGGT GTCGTCGGGA CGCTGTTCCT GAATCGAGAC CGGGAGTCCA 550 
TCGACGACAA GTACCTCGCN CCTTGAGGCG GTCCGGACTC ACCGGTGAGT 600 
TCAACTCCGA CGCGAACGCC ATCGCCCGCS GCAAGCAGGT GTGCCGCCAG 650 
TTGCANASAC GGTGGCGAAC AGCNSA 676 
(2) INFORMATION FOR SEQ ID NO: 39 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 853 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-167 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 39 
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(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#3-281 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 43 

CGGYCCGNNC AAYYYGNCGC GCHNCGGYGY AGAGGTCGNY AAGGTCGCCA 50 
AGGTAACGCT GATCGAYGGG NACANGCAAG TATTGGTGNA CTTCACCGTG 100 
GHTHGCTHGC TGTYAGC 117 

(2) INFORMATION FOR SEQ ID NO: 44 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 385 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: BsaHI#1-21 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 44 

GAACCTCCTC GCCCGCGCTT GGCCTAGCAT TAATCGACTG GCACGACAGT 50 
TGCCCGACTG GGTACACGGC ATGGACGCAA CGCGAATGAA TGTGAGTTAG 100 
CTCACTCATT AGGCACCCCA GGCGTTGACA CTTTATGCTT CCGGCTCGTG 150 
TAGTTGTGTG GGAATTGTGG AGCGGATAAC AATTTCGACG ACGAGGAAAC 200 
AGCTGTAGAC ATGGATTGAC GAATTTGAAT ACGACTCACT ATAGGAATTC 250 
GAGCTCGGTA CGCGGGGATC CTCTAGAGTC CTTCGCCGCG GGTCGCCACC 300 
ATCAGGGCCA GTGCGATCGC AAGCGCGGGG TACCGGGCGC CATAGTCTTC 350 
AGCATCGGCG TGTTGACCGC AGAGACCGGA CGGGG 385 

(2) INFORMATION FOR SEQ ID NO: 45 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 285 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-12 
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(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(iii MOLECULE TYPE: genomic DNA 

5 (vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ixi FEATURE: 

(D) OTHER INFORMATION: AciI#3-206 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 41 

10 AG AT C G T C AG TGAGCAGAAC CCCGCCAAAC ZGGCCGCCCG AGGTGTTGTT 50 
CSAGGGCTGA AGNCNCTGCT CGCGACGGTC GCTGCTGGCC GTCGTCGGGA 10 0 
TCGGGCTTGG CTCGCGCTGT ACTTCACGCG GGCGATGTCG NCCCGCGAGA 150 
TCGTGTATCA TCGGGT 16 6 

■;2) INFORMATION FOR SEQ ID NO: 42 
15 { i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH: 221 
i B ) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#3-214
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 42

CCAGNTCCTC NNATATCGAC ACCCTCNACN AAGACCGCTT CGCGAGATCA 50
ACNCTCAGAT ATNCNNACTA TCNCCNNTNC ACGCACACCT CAACATNANA 100
NAATNGAACT ATNGNCTTCG CCTCACCACC AAGGTTCAGG TTANCGGCTG 150
NCGTTTKCTC TKCGCCGGCT CGAACACGCC ATCGTGCGCC GGKACACCCG 200
GATGTTTGAC GACCCGCTGC A 221

(2) INFORMATION FOR SEQ ID NO: 43

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE: 
TGGGTGATCG CAGTGAGCAG GTCGACCGCC TATTGGTCAA CGCTAAGACC 200
CTGATCGCCG CGTTNCAACR GASNGCGCCG CGCGGTCGAC GCCCTGCTGG 250
GGAACATCTC CGCTTTCTCG CCCAGGYGCA AAACCTTCAT SAACGACAAN 300
CCGAACCTGA ACCATGTGCT CGAGCNGCGC ATCCTSACSA CCTGTTGGTS 350
GACSGCAAGG AGGATTTGGC TGAAANCCTN ACGATSTTGG GCAGAKTCAG 400
CG 402

(2) INFORMATION FOR SEQ ID NO: 48

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 468
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#1-200
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 48

AGNCCGTGCA CTGGAANCTT CGGCTCAGWT GTCTCCGATG TGGACGGCAA 50
SGCTGATGAT CTCCCGGTTG GAAGTCGANT CGATKASAAA TGGCTTGGCG 100
GCTGGTGGTG TTCGATGCCT GGCACCRACT GGCBACGATC NSCGCCTGGN 150
CGCGATCGGC GCTTAGCTCG GCTGGNNCCC TGTGGTGGGT TTCGACGTGC 200
TCGGTGTTGG TGCTGCTGGT GGTCGAAGGT GTGGCAATCA ACGTTCTGGC 250
TGTTGCGTCG TGATTCGGTA ACCGTCGGTA CCGACGACGA TGCGCCCGGG 300
CTGCGACTGG CCGTTGTCTT CCTGTGCNNG CCGCCGCGAT CTCGGCGGCN 350
GTGGTGACTG GGTACCTGCG CTGGACGACA CCGGACCGCG ACTTCAATCG 400
GGATTCCCGG GAAGTGGTGC ATCTTGCCAC GGGGATGGCC GAGACGGTCG 450
CGTCATTCTC CCCGAGCG 468

(2) INFORMATION FOR SEQ ID NO: 49
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 417 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 45

CCCGCAGCAG TACCCGCAGN CCCACACCCG CTATNCGCAG CCCGAACAGT 50 
TCGGTGCACA GCCCACCCNA GCTCGGCGTG CCCGGTCAGT ACGGCCAATA 100 
CCAGCAGCCG GGCCAATATG NCCAGCCGGN ACAGTNACGN CCAGCCCGGC 150 
CAGTACGCNA CCGCCCGGTC AGTACCCCGG GCAATACGGC JGCGTATGNCC 200
AGTCGGGTCA GGGGTCGAAG CGTTCGGTTG CGGTGATCGG CGGCGTGATC 250 
GCCGTGATGG CCGTGCTGTT CATCGGCGCG GTTCT 285 

(2) INFORMATION FOR SEQ ID NO: 46
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#1-142
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 46

GCNCGTGNCC GTGCCGCCCG GTTGAACGTG AGCNGCTGNC NATNGCCCCA 50
GCCGAGACGA GAACGTCCCC GAGGAGTATG CAGACTGGGA AGACGCCGAA 100
GACTATGACG ACTATGACGA CTATGAGGCC GCAGACCAGG AGGCCGCACG 150
GTCGGCATCC TGGCGACGGC GGTTGCGGGT NCGGTT 186

(2) INFORMATION FOR SEQ ID NO: 47
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-144
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 47

GTCGCTGAAT GTGTTGTCGG AGACCGTGAT CAGACCTATC CGCACCTGAG 50 
CGCCGCCTCC ACGGGTGGCT AAGTTCTCCG ACACCATCGG CAAGCGCGAC 100 
GAGCAGACTC ANGCACCTAC TAGCCCAGGC CAACCAGGTG GCCAGCATCC 150 
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#2-145 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 51 

CGGCCCGGCG GCGCCCTGGT GAAGCTTGGA GAATGGGTGA GCGCAGCTGC 50 
CCACCACACG GGACCGGTGC GGACGCGSTG ACGCGCCTGG TGGTCAGCAN 100
CNTGGCCGGT CTGCTGTTGT ATGCCAGCTT CCCGCCGCGC AACTGCTGGT 150
GGCGGCGGTG GTTGGGCTNC GCATTGCTGG CCTGGGTGCT GACCCACCGC 200
GCGACGACAC CGGTGGGTGG GCTGGGCTAC GGCCTGCTAT TCGGCCTGGT 250
GTTCTACGTC TCGTTGTTGC CGTGGATCGG CGAGCTGGTG CNCCGGGCCC 300
TGGTTGGCAC TGNCGACGAC GTGC 324
(2) INFORMATION FOR SEQ ID NO: 52
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 229

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#2-150
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 52

CCAGGCTAGC ACGTATGCTC CGGCTCGTTG TGTGTGGAAT GTGAGCGGAT 50
GACANKNCAC ACAGGADAYA GCTATGACNA TGATTACGCC AAGCTATTTA 100
GGTGABACTA TAGAATAYTC AAGCTATGCA TCCAAYGCGT TGGGAGCTCT 150
YCCATATGGT CGACCTGCAY GCGGCCGCAC TAGTGATTST THGCGCCGGC 200
NYGCWGCGGC NYAYGACCGC YAAYACCAC 229

(2) INFORMATION FOR SEQ ID NO: 53
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 293

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#3-28
(D) OTHER INFORMATION: HinPI#2-23
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 49

GTCCAAGGCC GTAGCCCACC TCCTGGAAGT CGTACCACGT CGACTCGACC 50
AGGACGGCTG CAGTCAGCAC TTCGTCAACC CGCGATCATC AACGTGCACC 100
TACGGCAGTG TGACGCACCC CGGACCATCG CAC i uov^uou GGTTCACACG 150
CCGAACACTG CTGACCGCAC TGGATCTGCT GGTCGCATGC ACCACTTCAA 200
GGTGGTGACG TACCTCAAAA TGGGTTTCCC GTTGTCCACC GAGGAAGTCC 250
CGCTGATTCA TGGGCAATAA CGCTCCCTAT CCGCAGTGTC ACCAGTGGGT 300
GCAAGCGGCG ATGGCCAAGT TGGTCGCTGA CCACCCCGAC TACGTTTTCA 350
CAACCTCGAC TCGACCGTGG AACATCAAAC            GATGCCAGCA 400
ACCTATGTCG GGATCTG 417

(2) INFORMATION FOR SEQ ID NO: 50

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE:

(D) OTHER INFORMATION: HinPI#2-143
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 50

CGGTCGAGCC GATGAACGTC TGCAGTTCAC CGCAACCACG CTCAGCGGTG 50
CTCCCTTCGA TGCGCAAGCC TGCAAGGCAA TGCCGCGGTG TTGTGGTTCT 100
GGACGCCGTG GTGCCCGTTC TGCAACTGTC AGAAGCCCCC AGCCGCAGCC 150
AGGTAGCGGC CGCTAATCCG GCGGTCACCT TCGTCGGAAT CGCCACCCGC 200
GCCGACGTCG GGGCGATGCA GAGCTTTGTC TCGAAGTACA ACCTGAATTT 250
CACCAACCTC AATGACGCCG ATGGTGTGA 279
(2) INFORMATION FOR SEQ ID NO: 51

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(A) LENGTH: 117 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE : genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-34
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 55

^mov-v-ACCTC GTTCGCCGGC GACATCGACT ATCAGCCGAC CZGGCCACTG 50
CTGACCTGAT CGCCAACAGC TGGAGGCCCT ACCGGCTGCA GTTCAATTCA 100
l. wCoCTGCGG GTCGGCG 117

(2) INFORMATION FOR SEQ ID NO: 56
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 242 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-41 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 56

AGGTGTCGTG CTTCATGCCT GGCGCCCAAT CCAGTTTCTA CACCGACTGG 50
TATCACCCTT CGCAGACAAA CGGCCAGAAC TACACCTACA AGTGGGAGAC 100
CTTCCTTACC ACACAGATGC CCGCCTGGCT ACAGGCCAAC AAGGCGTGTC 150
CCCCACAGGC AACGCGGCGG TGGGTCTTTC GATCTCGGGC GGTTCCGCGC 200
TGACCCTGGC CGCGTACTAC CCGCAGCAGT TCCCGTACGC CG 242

(2) INFORMATION FOR SEQ ID NO: 57

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 53

CCACACAACA CAAATCTACG TCGTAATGCA GTCGTAAGTC CATCCGACGT 50
CGATGGCAAG GACAGCACCC GACGGCCAAC GGCATATACA TCGTCGGCTC 100
GCCGGTCACA AGCACATCAT CATGGACTCG TCCACTACGG CGTACCCGTC 150
AACTCGCCCA ACGGATATCG CACCGATGTC GACTGGCCAC CCAGATCTCC 200
TACAGCGGTG TCTTCGTGCA CTCAGCGCCG TGGTCGGTGG GGGCTCAGGG 250
CCACACCAAC ACCAGCCATG GCTGCCTGAA CGTCAGCCCG AGC 293

(2) INFORMATION FOR SEQ ID NO: 54

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 816

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#3-30 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO 54

CGNCGYCGSC GNGCSCTAYC GGTGCGGGAG GGTACAVCCA AGCANTCCGG 50
GACCGGCCGT CYCGCYGGGA ACGCCGTGCT CCTACAYACC GGCGRCGGGC 100
GCGTTGCCAC GSCCCGACAC CCCACTACCC NGNCGCGGGC GCCACCRTTG 150
GCCCGTTNMG GTGGACCCGA NCTTCCCGGC ACCGCTCGAT GTCCAGCCGT 200
CGCCGCCTAA TCCCGATGGG CCGCMGCCGA CKCCGGGCAT CCTAAGTGCT 250
GGGCGGCCGG GCGAGCCGGN TCCGGNTGTT CCGGCATACC GWTGCCSYTG 300
CCGNCGAACN TGCACGCACC CAACCGCTTG AGCCGTTTCC TGACGGGACG 350
GGAGGTAGCA ACCAATGAGC ACCATCTTCG AYATCCGSAG CCTGCKACTN 400
GYCGAWACTG TCTNGCAAAG GTAGTGGTCG TCGGCGGGTT GGTGGTGGTC 450
TTGGCGGTCG TRGCCGNCTG NCRGCCGGCG CGCRGCTCTA CCGGAAACTG 500
ACTANACTAC CGTGGTCGCR TATTTTCTST GAGGCGCTCG CGCTGTACCC 550
AGGAGASAAA GTCCAGATCA TGGGTGTGCG GGTCGGTTCT ATCGACAAGA 600
TCGAGCCGGC CGGCGACAAG ATGCGAGTCA CGTTGCACTA NCAGCAASAA 650
ATACCAGGTG CCGGCCACGC TACCGNYGNW CGMTCCTCAA CCCCAGCCTG 700
GTGGCCTCGC GCACCATCCA GCTGTCACCN NCGTACACCG GCGGCCCGGT 750
CTTGCAAGAC GGCGCGGTGA TSCCAATCGA GCGCACCCAG RTGCCCGTCG 800
AGTGGGATCA GTTGCG 816

(2) INFORMATION FOR SEQ ID NO: 55
(i) SEQUENCE CHARACTERISTICS: 
(ix) FEATURE:

(D) OTHER INFORMATION: HpaII#1-10
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 59

CCACCANNNA ACRRCACAGC TCCGGCCRRC CGTNCGCAGG CCACCCGCAN 50
CGTAGTGCTC AAATTCTTCC AGGACCTCGG TGGGGYACAT CCGTCCACCT 100
GGTACAAGGC CTTCAACTAC AACCTCGCGA CCTCGCAGCC CATCACCTTC 150
GACACGTTGT TCGTGCCCGG CACCACGCCA CTGGACAGCA TCTACCCCAT 200
CGTTCAGCGC GAGCTGGCAC GTCAGACCGG TTTCGGTGCC G 241
(2) INFORMATION FOR SEQ ID NO: 60
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 243
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: HpaII#1-13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 60

CCGGCGGATC TGCGTGACGA NTGTATNCCA CGGNACTACC CGCGGTCCTT 50
CCTCNANTNC CGCCGGNCCA GNCGCAGNCT NCNGATGTCC NGCTATAACC 100
TGCGCGATCG CCGCCGGGCT GCCCGACAAC ACGGTGNGCG CCGCCGCTGC 150
TTCCGCCAAT 7CTGGGTGNC GGCATNCCGG CAGCGCCCGG CCCAGCACTG 200
AGAGGGGGAC GTTGATGCGG TGGCCGACGG CGTGGCTGCT GGC 243

(2) INFORMATION FOR SEQ ID NO: 61
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2346

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-825 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 61 
(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: HpaII#1-3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 57

TGCTGCAGAT AGCCAAGGAT CCAGTCGTGA TTGATATCAC GTCTTTCCAG 50
TGAATTGAAG TTTGGCTATC AAAGGGTGAA CTTSAAAGAC GGCACACTGA 100
CCTATGATGG TGCCGATCCG GAGCGCAAGC GCGCCATGGT TTCCAAGCCA 150
GAGGGCAAGN ACAAGTACGG CGAAGAGCTG GTCGGGCCGG TGCGCGGGCT 200
CAACACCGAG GACCGGACCT ACCTGAATTT CGACAAGGTC GAGACGTTGG 250
GCAGCAGCAC CGAAATTCCG GTGCTGGTGC TGCCGTCCGG CAAGCGTATC 300
GAATTCCAAA TGGCCTCAGC CGATGTGATA CACGCATTCT 340

(2) INFORMATION FOR SEQ ID NO: 58
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 262
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE:

(D) OTHER INFORMATION: HpaII#1-3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 58

CNGACTCCAA CNAGTGCGNT CAANCNGNTG TNCCNGACAA GAAGGTTCCT 50
ACATCCGCAA NTCGGTGNAA NGCCACTGTG GATGCCTACG ACGGAACGGT 100
CACGCTGTAC CAACAGGACG NAAAAGGATC CGGTGCTCAA GGCCTGGATG 150
CAGGTCTTCC CCGGCACGGT AAAGCCTAAG AGCGACATTG CGCCGGAGCT 200
TGCCGAGCAN CTGCGGTATC CCGAGGACCT GTTCAAGGTG CAGCGCATGT 250
TGTTGGCCAA AT 262

(2) INFORMATION FOR SEQ ID NO: 59

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
CCCAGGCCAT CGTCGCCGGC CGCCACGAGA TCGAGGACGA GCCCGCGGTG 1950
GTGGTGTGGC TGGCGTCCGG CTTGGCCGCC GAGACATTCC AGCTGGACTT 2000
TGTCNGTACC GGCTCGGGTG CCCTGATCAC CGGTTATCGG TTCGACCGNA 2050
CCGCCCGGGA TCTGCATCTG CTGCTGCCGG ACCCGTACAC ATTCCCGTCG 2100
AACCTGCTCA TCGAGCACCC CAACACCGAC CTGCCGGGCA CCGCNGTCGT 2150
GGGCGGCGNT GGTGAGCGGC GGGCGCCGGC GGGGCGACAC CCGGSTGTKC 2200
CGCGATCACG ACGTGCTCAC CTCCGGMGTC GTCGGCGTGC GCCTGCSCGG 2250
GATGCGCGGT GTMCCGGTCG TGTCGCAGGG TTGNCGGCCG ATCGGCTACC 2300
CATACATCGT CACCGGMGCG GACGGCATAC TGRKCACCGA GCTCGG 2346

(2) INFORMATION FOR SEQ ID NO: 62

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 841

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#4 3 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO 62

CGTTACCCGC TTTACACCAC CGCCAAGGCC AACCTGACCG CGCTCAGCAC 50
CGGGCTGTCC AGCTGTGCGA TGGCCGACGA CGTGCTGGNC NAGSCCNANS 100
CCAATGNCGG MMTGCTGCAA NCGGNTNCNG GCCANGCGTT CGGACCGGAC 150
GGACGCTGGN CGGTATCAGT CCNGTCGGCT TCAAANCCGA NGGCGTGGGC 200
GAGGACCTCA AGTCCGRRCC CGGTGGTCTC NAAACCCSGG CTNGTCAACT 250
CCGATNCGTC GCCCAACAAN CCCAACNGCC NGCCATCANC GACTCCKCNG 300
GCACCGCCNG AGGGAAGGGY CCGGNTCGGG ATTCAACGGG TTGGCRWCGC 350
GGCGCTGCCG TTCNGRATTG GAYCCGGCAN CGTACCCCGG TGATGGGCAG 400
CTNACGGGGA NGAACAACCY GSCCSSSACG GCCACCTCGG CCTGGTACCA 450
GTTACCGCCC CGCAGCCCGG ACCGGCCNGC TGGTGGTGGT TTCCNGCGGC 500
CGGCGCCATC TGGTCCTACA AGGAGGACGG CGATDTCATC TACGGCCANG 550
TCCCNTGAAA CTGCAGTGGG NCGTCACCGG CCCGGACGGC CGCANTCCAG 600
CCACTGGGGC AGGTATTTCC GANTCGACAN TCGGACCNGC AACNCCNGCG 650
TGGCGCAATC TGCGGTNTNT CCGCTGGCCT GGGCGCCGCC GGNANGCNCG 700
ACGTGGCGCG CATTGTCGCC TATGACCCGA ACCTGAGCCC TGAGCAATGG 750
TTCGCCTTCA CCCCGCCCCG GGTTCCGGTG CTGGAATCTC TGCAGCGGTT 800
GAKCGGGTCA GCGACACCGG TGTTGATGGA CATCGCGACC G 841
GCGCTGNCAT TCGNACTTCG GACNGCGTTN GCGGTGGTGC TGATCATGAA 50
NCTACGACGG CGCCACCGGC AGCTTCCCGT CATGGGTGCT CTATCCCTGT 100
GCGCTGGCCA TGATGGTGTT CTCGAATKCG TTCAGCGTNC TGCGCAGCGC 150
AGTGANACCG AGGGTGATGC CGCCAACCAT CGACTTGGTC CGGGTCAACT 200
CACGGCTGAC CGTGTTCGGC CTGCTCGGCG GCACCATCGC TGGTGGCGCG 250
ATTGCGGCCG GAGTCGAATT CGTCTGCACC CACCTGTTCC AGCTGCCGGG 300
CGCGTTGTTC GTCGTCGTCG CGATCACCAT CNNTNNNGCT TCGCTGTCGA 350
TNCNCATTCC GCGCTGGGTC GAGGTGACCA GCGGTGAGGT CCCGGCCACA 400
TTGAGCTACC ACCGGGATAG GGNCAGACTA CGGCGACNGC TGGCCGGAGG 450
AAGTCAAGAA CCTCGGCGGA ACACTCCGAC AACCGTTGGG CCGCAACATC 500
ATTACCTCCC TGTGGGGTAA CTGCACCATC AAGGTGATGG TCGGCTTTCT 550
* * . jiAl ^Ljuu^j i i iU
TTCGCCGGCA ATTTCACCAG
TCGCCAAGGC GCACGAAGCC AACGGGTGGG 600
CTGATCGGCG CGGCGGCCZC GG7CGGCAAC 650
CGCACGCCTG CAGCTAGGCA GGCCAGCTGT 700
GCKGGTNGTG CGCTGCACCG TGCTAGTTAC CGTGTTAGCC ATCGCGGCCG 750
CGGTGGCCGG CAGCCTGGCA GCGACAGCNA TTGCCACCCT GATCACGGCA 800
GGGTCCAGTG CCATTGCTAA AGCCTCGCTG GACGCCTCGT TGCAGCACGA 850
CCTGCCCGAG GAGTCGCGGG CATCGGGGTT TGGGCGTTCC GAGTCGACTC 900
TTCAGCTGGC CTGGGTGCTG GGCGGCGCGG TGGGCGTGTT GGTGTACACC 950
GAGCTGTGGG TGGGCTTCAC TGCGGTGAGC GCGCTGCTGA TCCTGGGTCT 1000
GGCTCAGACC ATCGTCAGCT TCCGCGGCGA TTCGCTGATC CCTGGCCTGG 1050
GCGGTAATCG GCCCGTGATG GCCGAGCAAG AAACCACCCG TCGTGGTGCG 1100
GCGGTGGCGC CGNAGTGAAG CGCGGTGTCG CAACGCTGCC GGTGATCCTG 1150
GTGATTCTGC TCTCGGTGGC GGCCGGGGCC GGTGCATGGC TGCTAGTACG 1200
CGGACACGGT CCGCAGCAAC CCGAGATCAG CGCTTACTCG CACGGGCACC 1250
TGACCCGCGT GGGGCCCTAT TTGTACTGCA ACGTGGTCGA CCTCGACGAC 1300
TGTCAGACCC CGCANGCGCA GGGCGAATTG CCGGTAAGCG AACGCTATCC 1350
CGTGCAGCTC TCGGTACCCG AAGTCATTTC CCGGGCGCCG TGGCGTTTGC 1400
TGCAGGTATA CCAGGACCCC GCCAACACCA CCAGCACCTT GTTTCGGCCG 1450
GACACCCGGT TGGCGGTCAC CATCCCCACT GTCGACCCGC AGCGCGGGCG 1500
GCTGACCGGG ATTGTCGTGC AGTTGCTGAC GTTGGTGGTC GACCACTCGG 1550
GTGAACTACG CGACGTNCGC ACGCGGAATG GTCGGTGCGC CTTATCTTTT 1600
GACGAGGCCG CGGCTCGACG NCNCCTTAAG CGCGGTCGGC GCCAACGGTC 1650
CGAAGAGCCG CCGACACCCG GGGCACATCG GCGCATCATG GAACTGTGCG 1700
GATCGGAGTC GGGGTTTGCA CCACGCCCGA CGCGCGGCAG GCCGCGGTGG 1750
AGGCTGCGGG CCAGGCGCGC GACGAGCTGG CGGGTGAGGC GC£GTCGCTG 1800
GCGGTGTTGC TTGGATCGCG TGCACACACC GACCGGGCTG CCGACGTCCT 1850
GAGCGCGGTG CTGCAGATGA TCGACCCGCC CGCGCTTGTC GGTTGCATCG 1900
GNATNCGACG GCAATAAACA CACGCTCTGG CCACGTTTCT TGGCGGGGAA 250
AGGGGTGATG CTATCGGAGC CAATGGTATC GCGACAACAC TTGCAGATGC 300
CGCCAAGGCC GATCACGCTA ATGACGGATT CGGGGCCACA AACGTTCCCC 350
GTTCTGGCGG TTTTCTCTGA CTACACCTCA GATCAAGGTG TGATTTTGAT 400
GGATCGCGCC AGTTATCGGG CCCATTGGCA GGATGATGAC GTGACGACCA 450
TGTTTCTTTT TTTGGCNATN CGGGTGCGAA TAGCG 485
(2) INFORMATION FOR SEQ ID NO: 65

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 469
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#1-264A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 65

GGCGAGGTCA GTGAAGCCGA GGAAGCGGAA AGGAGCGCCC AATACGGAAC 50
CGCCTCTCCC CGCGCGTTGG CCGATTCATT AAATGCAGCT GGCACGACAG 100
GTTTCCCGAC TGGAAMGCGG GCAGTGAGCG CAASGCAATT AATGTGAGTT 150
AGCTCACTCA TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT 200
ATGTTGTGTG GAATTGTGAG CGGATAACAA TTTCACACAG GAAACAGCTA 250
TGACATGATT ACGAATTTAA TACGACTCAC TATAGGGAAT TCGAGCTCGG 300
TACCCGGGGA TCCTCTAGAG TCGCTTCGGT TGGCGGCGAC CAGCAGTGGA 350
TCCACGGTGG CCGCCCGCGC GGCDTCATAC ACCGCCGCGG CCTCCTTGGC 400
CTGTGCGGCC SGCTTAGCGC GCGTGTTGCT GCCGTGCTTA GCCANCTGGC 450
ATAGGGGGCT GCCGCGCGC 469

(2) INFORMATION FOR SEQ ID NO: 66

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 290

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(2) INFORMATION FOR SEQ ID NO: 63

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 471

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#1-2/23/9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 63

GGCAGCGGTG ATCGGCTGAC CGGCAGTGAT CACCAACCTC AACGTGGTGC 50
TGGGCCTCGC TGGCGCTCAC ACGATCGGTT GGACCAGCCG GTGACGTCGC 100
TATCAGCGTT GATTCACCGG CTCGCGCAAC GCAAGACCGA CATCTCCAAC 150
GCCGTGGCCT ACACCAACGC GCCGCCGGCT CGGTCGCCGA TCTCTGTCGC 200
AGGCTCGCGC CGTTGGCGAA GGTGGTTCGC GAGACCGATC GGGTGGCCGG 250
CATCGCGGCC GCCGACCACG ACTACCTCGA CAATCTGC7C AACACGCTGC 300
CGGACAAATA CCAGGCGCTG GTCCGCCAGG GTATGTACGG CGACTTCTTC 350
GCCTTCTACC TGTGCGACGT CGTGCTCAAG GTCAACGGCA AGGGCGGCCA 400
GCCGGTGTAC ATCAAGCTGG CCGGTCAGGA CATGCGGCGG TGCGCGCCGA 450
AATGAAATCC TTCGCCGAAC G 471

(2) INFORMATION FOR SEQ ID NO: 64

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 485

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#1-229/264
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 64

KGTCTCGCGN CCTTNACATC CGGTCGCCNN RCGGTNATCT GCCTGTGGAT 50
GCCGTCCGGA NGTATNANCN AATGGCCANG AGTNCGTGAC NGCAGNTATG 100 
GNCKCGGNTA TAGTTCCGTT TTGCCCNGGA CTNGGNGCGT GAGGTGGAAC 150 
TAATGGCGGT GTCGGGTGAT ATTTCCGACG GCAAGNCGAC CATATAGGTG 200 
CACTGCGCCG AGATGTGTGG CACGTATCAC TCGATGATGA ACTTCGAGGT 950
CCGCGTCGTG ACCCCCAACG ATTTCAAGGC CTACCTGCAG CAACGCATCG 1000
ACGGGAAKAC AAACGCCGAG GCCCTGCGGG CGATCAACCA GCCGCCCCTT 1050
GCGGTGACCA CCCACCCGTT TGATACTCGC CGCGGTGAAT TGGCCCCGCA 1100
GCCCGTAGGT TAGGACGCTC ATGCATATCG AAGCCCGACT '.QTTTGAGTTT 1150
GTCGCCGCGT TCTTCGTGGT GACGGCGGTG CTGTACGGCG TGTTGACCTC 1200
GATGTTCGCC ACCGGTGGTG TCGAGTGGGC TGGCACCACT GCGCTGGCGC 1250
TTACCGGCGG CATGGCGTTG ATCGTCGCCA CCTTCTTCCG GTTTGTGGCC 1300
GCGGAT 1306

(2) INFORMATION FOR SEQ ID NO: 68

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 759

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#2-823

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 68 

GGTGCCTGCC ATCGGTTCGC TGNGCCACNG CTGNCNNATC TTTGGTSTGT 50
TAGAGGTNWW CCGCGCGGAT RGCNCANTCC TGTTGGNGGG GGTTRTCGCC 100
ACGATTGCCG CCCGCGCTGA ACCCGACGAC GCCGATGCCC TGCCCACCAC 150
GGATCGGCTG NNMMCANCCG AGCGAACCGT GCAGNATGCN TNTKGTTGAC 200
GAGCCTGCTG GCGCCTTCGC NGGCNCTCGG CGACCATCGG TGCCATCGGA 250
ACCGCCGTNC GCAACCCACG GCATCCACAN GSTCCANGCA TGGCGGTATC 300
GCGNTTGGCC GNCGTCACCG GTGCGCTGCT GCTGCTAYGA GCACGTTCAG 350
CAGACACCAG AAGGTCACTG NTGTTTGCCA TCTGTNGGAA TCACCACCGT 400
TGCAACGGMA NTTGTACCGT CGCCGCGGAT CGGGCTCTGG AACACGGGCC 450
GTGGATTGSC GCGCTGACCG CCATGCTGGT CCNGCCGTGG CAANTGKKTT 500
TGGGCTTCGT NGCTCNCCGC GTTGTCGCTC TCGCCCGTCA CGTACCGCAC 550
CATCGAATTG CTGGAGTGTC TGGCGCTGAT CGCAATGGTT CCATTGACCG 600
CTNTGGSTAT NNNNNCGCCT ANCAGSSSCS TTCGCCACCT CGACCTGACA 650
TGGACATGAC CACNGTCCCG TNACCCTGCG CCTGNCTNGG TGGTMTCAGC 700
GNCNNNTCGY SACGCTGTCT GGSWTGGSRM RCGCNCGGTT GCGCCACGCG 750
GTTTCGCCG 759

(2) INFORMATION FOR SEQ ID NO: 69
(D) OTHER INFORMATION: AciI#1-264C
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 66

CNGGTTCGAC TGATCTAGCT GGGGCGAGAC CGGCACGAGG CGACAGTTAC 50
CAGTACCTGA CAGACAGGCC GATCGAGCCA AACCGTAGTG AGGACGCAGG 100
AGGAACAGGC AGATGCATCT AATGATACCC GCGGAGTATA TCTCCAACGT 150
GATATATGAA GGTCCGCGTG CTGACTCATT GTATGCCGCC GACCAGCGAT 200
TGCGACAATT AGCTGACTCA GTTAGAACGA CTGCCGAGTC GCTCAACACC 250
ACGCTCGACG AGCTGCACGA GAACTGGAAA GGTAGTTTCA 290
(2) INFORMATION FOR SEQ ID NO: 67

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1306

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#2-92 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 67

GTGATACAGG AGGCGCCAAC AGTGACACCT CGCGGGCCAG GTCGTTTGCA 50
ACGCTTGTCG CAGTGCAGGC CTCAGCGCGG CTCCGGAGGG CCTGCCCGTG 100
GTCTTCGACA GCTGGCGCTC GCAGCAATGC TGGGGGCATT GGCCGTCACC 150
GTCAGTGGAT GCAGCTGGTC GGAAGCCCTG GGCATCGGTT GGCCGGAGGG 200
CATTACCCCG GAGGCACACC TCAATCGAGA ACTGTGGATC GGGGCGGTGA 250
TCGCCTCCCT GGCGGTTGGG GTAATCGTGT GGGGTCTCAT CTTCTGGTCC 300
GCGGTATTTC ACCGGAAGAA GAACACCGAC ACTGAGTTGC CCCGCCAGTT 350
CGGCTACAAC ATGCCGCTAG AGCTGGTTCT CACCGTCATA CCGTTCCTCA 400
TCATCTCGGT GCTGTTTTAT TTCACCGTCG TGGTGCAGGA GAAGATGCTG 450
CAGATAGCCA AGGATCCCGA GGTCGTGATT GATATCACGT CTTTCCAGTG 500
GAATTGGAAG TTTGGCTATC AAAGGGTGAA CTTCAAAGAC GGCACACTGA 550
CCTATGATGG TGCCGATCCG GAGCGCAAGC GCGCCATGGT TTCCAAGCCA 600
GAGGGCAAGG ACAAGTACGG CGAAGAGCTG GTCGGGCCGG TGCGCGGGCT 650
CAACACCGAG GACCGGACCT ACCTGAATTT CGACAAGGTC GAGACGTTGG 700
GCACCAGCAC CGAAATTCCG GTGCTGGTGC TGCCGTCCGG CAAGCGTATC 750
GAATTCCAAA TGGCCTCAGC CGATGTGATA CACGCATTCT GGGTGCCGGA 800
GTTCTTGTTC AAGCGTGACG TGATGCCTAA CCCGGTGGCA AACAACTCGG 850
TCAACGTCTT CCAGATCGAA GAAATCACCA AGACCGGAGC ATTCGTGGGC 900
(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: HinPI#1-3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 70

AGATCNAYAC YANCANCANT GCNGTCATCG AGNTGCTGCA GGNCANGGTG 50
GTCCGTTGGC GAACGTGCTN KGCCNAYACC GGTGCCTTCT CGGCGCNCTN 100
GGYGCAYNGC GACCAGCTGA TCGGCGNAKG TAATCACCAA CCTCAANNKC 150
GGTGCTNGCK ACCKTCGAYK GCAAAGAGYG YGCAATTTGT CGGCCAGTGT 200
CGACCAGCTG CAGCAGCTGG TCAGCGGCCT GGCCAAGAAC CGGGATNCCG 250
ANTSGNGGGC GCCATTTCGC CGCTGGNGTC GACGACGACG GATCTTWCGG 300
AACTGTTGCG GAATTSGCGC CGGCCGCTGC AAGGCAKCCT GGAAAACGCC 350
CGGCCGCTGG CTACCGAGCT GGACAACCGA AAGGCCNANG GTCAASAACG 400
RRATCGAGCA NGCTCGGCGA GGACNATNCC TGCGCCTGTC CGCGCTGGGC 450
AGTTACGGAG CANTTCGTTC AACATCTAST TSTGCTCGGT GACGATSAAG 500
ATCAACGGAC CGGCCGGCAG CGACANTCCN TGCTGCCGAT CGGCGGCCAG 550
CCGGANTCCC AGCAAGGGGA GGTGCGCCTT TGCNTAAATA GGAAGCCAAG 600
TANGCAAASA CGAASGCSAC CCGTCCGCAC CGGNCATCTT CGGCCTGGTG 650
CNTGGTGATC NTGNCGTCGT CCTGATSGNC ATTCGGCTAC AGCGGGTTGC 700
CTKTCTGGCC ACAKKKCAAA ACCTACGACG CGTATTTCAC CGACGCCGGT 750
GGGATCACCC CCGGTAACTC GGTTTATGTS TCGGGCCTCA AGGTGGGCG 799

(2) INFORMATION FOR SEQ ID NO: 71 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 713

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-827 translation strand 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 71

CTAYCSGCAA NGCTKNGCAG ACGCTCGGCT GCACNGCAGA ANTCGCGGTG 50
CACCCACGAT TGCCAGTAGC GCGGGCCCAC TCGTGCCTAC TACACTTCGT 100
CGTAGCCAAA TCANTCGGCC CCGTAGTATC TCCGGAGATG ACAGATGAAT 150
GTCGTCGACA TTTCNGNCGG TGGCAGTTCG GTATCACCAC CGTSTATCAC 200
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 
(D) OTHER INFORMATION: HinPI#1-31

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 69

GKTCNCGGTG ATGTCGACNG TCGGCACGRM GNCGAAACCT CANCGGTCGA 50
CAGTGTCTGC CCGAGGCCGC AGCCGACGTG CCCCNGGAGA CCGCGCGCCA 100
ANCACGGTGC                                             150
GTAGATGTTT TCCTGCACGG CGTNCSCGGT GAACCCTCCG GCGCCAGCAC 200
CGSCACCWNT TCCCGCGTCC ACGTCGGCCT GGGTGGTGAC GCCGAGCACC 250
CCACCGAAAT GATCGACATG GCTGTGGGTG TAGATGACCG SCGACCACGG 300
GGCGGTCGGC TCCGCGGTGG GCGCGANTAC AAGTCCAGCG CGGCGGCGGC 350
CACCTCGGTG GACANCCAAN CGGGYNYGAT GACGARWCWG CCCAGTGTCA 400
CCNCWMMACG AAGNCTGATA TTGGAGATAT CGAATCCGCG GACCTGATAG 450
ATGCCCGGCA CCACCTGGTA GAGGCCCTGT TTCGCGGTCA GCTGGGATTG 500
CCGCCACAGG CTGGGATGCA CCGATGTCGG CGCGGCACCG TCGAGNAACG 550
AGTACGCGTC GTTGTCCCAC ACCNACGCGA CCATCGGCAG CCTTGATCAC 600
ACACGGGGAC AGCGCGGCAA TGAATCCGCG ATCGGCGTCG TCGAAATCCG 650
TTGTGTCATN GCAACGGTNA ACGAGTGTTC ACCGTGTGCC GCCTGGNATG 700
ACGGCAGTNG GGAGGTTTGT GTTCCATCGG CACTACATTG CCACTACTAC 750
GGTGCACGCC GGTAGATGCC GTTGGCGAAC CACGCTACCG ACCAGAAAGA 800
GAGAATTTTC CGCCGCACCT AGACCTCGGG CCCTCTAACG CGCATACTGC 850
CGAAGCGGTC CTCAATGCCG ATGGACCGCT ACGACAGGCA AAGGAGCACA 900
GGGTGAAGCG TGGACTGACG GNTCGCGGTA GCCGGAGCCG CCATTCTGGT 950
CGCAGGTCTT TCCGGATGTT CAAGCAACAA GTCGACTACA GGAAGCGGTG 1000
AGACCACGNA CCGCGNGCAG GCACGACNGC AAGCCCCGGC G 1041

(2) INFORMATION FOR SEQ ID NO: 70
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 799



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ix) FEATURE:

(D) OTHER INFORMATION: AciI#2-874 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 73 

GTGATGCCTT CCAGCATTGG ATTGGTCGTC GGTTCGATGC TGTGGCGACA 50 
GATAAACCGC CTGTTCGGGG TGCGTGGCCT CTGCTGGGCA GCGCACTGCT 100
CAACGCCGdT CTGCGCTGCT GTGCATGGTG GCCGAGTCGT GTGGGCAGTG 150
GGTTCACGCC TGGGCGTACT TCACGGCGTT CCTGCTGGCT ACGGTGGCCG 200
CTCAAACGGT GGTCGCCGCA TCGATATCGT GGATCAGCGT CCTCGCGCCC 250
GA 252

(2) INFORMATION FOR SEQ ID NO: 74

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 160

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis
(ix) FEATURE:
(D) OTHER INFORMATION: AciI#2-1018

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 74

GGCGCCGCCG TCGTGCTGGC CGCCCGGCCC GGTGGGGGTG CCGGCCAGCG 50
TGGTTCCGCC AGTGGCCGCG CCGAACGTAT TGGCCGGCGT CCTCGAGCAC 100
GACAACGACG GGTCGGGGGC GGCGGTGCTG GCCGCGCTGG CCAAGCTGCC 150
ACCCGGTGGT 160
(2) INFORMATION FOR SEQ ID NO: 75

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE:

(D) OTHER INFORMATION: **HinPI#1-27
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 75

TTNCAWYTTC GTNACSYGYT GACCWWCGGC CTGGCNCNCC TKSTKANYRC 250
GGNTCNAYGC AAACTGCTGT GGTCGTCACC GATAANCCCG CCTGGTATCG 300
CCTCACCNAA ATTCTTCGGC AAATTGTTCC TGNATCNAAC NTTTGCCATC 350
GGCGTGGCGA CCGGAATCGT GCAGGNAATK TCAGTTCGGC ATGAACTGGA 400
GCGAGTACTC CCGATTCGTC GGCGATGTCT TCGGCGCCCC GCTGGCCATG 450
GAGNSCTGGC GGCCTTNCTT CTTCGAATCC ACCTTCATCG GGTTGTGGAT 500
CTTCGGCTGG AACAGGCTGC CCCGGCTGGT GCANTCTNGG CCTGCATCTG 550
GNATCGTCGC AATNCGCNGG TNCAACGTGT CCGCGTTCTT CATCATCGCN 600
GGCAAACTCC TTCATGCAGC ATCCGGTCGG CGCGCACTAC AACCCGACCA 650
CCGGGCGTGC CGAGTTGAGC AGCATCGNTC NGTGNCNTGC TGACCAACAA 700
CACCGCACAG GCG 713
(2) INFORMATION FOR SEQ ID NO: 72

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 274
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE: 

(D) OTHER INFORMATION: AciI#2-834 translation strand 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO 72

CCGCAGCACC GAGGCAAGCA TCGCACCCGT CGATTCCCGC CATCCCGGCG 50
ACATGATGGT CATGTCCGAC ACCGACGCCC GCACCTCGCT TCCCGAGTTG 100
ACCGCGCTGC GCGTGGACGC CGCAACGGAT GCGTCGGTTC ATTCGATCCC 150
GGCTCGAAAT TGGCCATGGC GAACGCATCT TGCTGTGATG GTTCGGGCAG 200
TAGATCTCCA CTGCCGCACT GATAAACTCG GGTCATGGTC GTCGTGAGGC 250
GGACAGGGTA GAGGCGCATG ACCG 274
(2) INFORMATION FOR SEQ ID NO: 73

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

I claim:

1. An isolated Mycobacterium tuberculosis nucleic acid sequence including a sequence selected from the group 
consisting of Seq. I.D. Nos. 1 - 76. 

2. A purified immunostimulatory peptide encoded by a sequence according to claim 1.

3. An antibody that specifically binds to a peptide according to claim 2. 

4. A vaccine preparation comprising at least one immunostimulatory peptide according to claim 2 and a
pharmaceutically acceptable excipient.

5. A purified immunostimulatory peptide encoded by a nucleotide sequence selected from the group consisting
of Seq. I.D. Nos. 1 - 76.

6. A vaccine preparation comprising at least one peptide according to claim 5 and a pharmaceutically acceptable
excipient.

7. A purified immunostimulatory Mycobacterium tuberculosis peptide, the peptide including at least 5 contiguous 
amino acids encoded by a nucleic acid sequence selected from the group consisting of Seq. I.D. Nos. 1 - 76. 


8. A vaccine preparation comprising at least one peptide according to claim 7 and a pharmaceutically 
acceptable excipient. 

9. A peptide according to claim 7 wherein the peptide includes at least 10 contiguous amino acids encoded by a
nucleic acid sequence selected from the group consisting of Seq. I.D. Nos. 1 - 76.

10. A vaccine preparation comprising at least one peptide according to claim 9 and a pharmaceutically 
acceptable excipient. 

11. A method of making a vaccine comprising:

providing at least one purified peptide encoded by a nucleotide sequence selected from the group consisting of 
Seq. I.D. Nos. 1 - 76;

combining the peptide with a pharmaceutically acceptable excipient. 

12. An isolated nucleic acid molecule having a nucleotide sequence selected from the group consisting of:

(a) Seq. ID Nos. 1 - 76; 

(b) nucleotide sequences complementary to a sequence defined in (a); and 

(c) nucleic acid molecules of at least 15 nucleotides in length which hybridize under conditions of at least 75%
stringency to a sequence defined in (a) or (b).






13. A recombinant DNA vector including a nucleic acid molecule according to claim 12.



14. A transformed cell containing a vector according to claim 13.

ATCAGCCGCG GGTCGACGCC GCCGATGACC TCGACGTCGT CGTCGTCGCT 50
GCCGGTACTC AATCCAATCA CCATCCTCTT ACGCACCTTC TAGGAGTGTG 100
TTGCTGCGGC AGTGCCGGCC ATTCGTAGAT TCGGGCCTCG GCGTTGTCGT 150
AGATCTTCGC CCACGACCTC GATGTCTCTA ACGACACTAG TCCGTCCGGC 200
ACGCAAACCC CGCACCGTCG GAGTGCTGGT CAGGTATAGA CGGTACAGGA 250
GGACTTGGTA GGCCTCGAGT ACCGAGGTAC GTCTCCCGTT GCGGCATAGG 300
CCAGAAGATG AACCGGTGTA GACCGGGCCT GTTGCGAGGG TCGTAGTCGT 350
AGGTCCCAGA GGTGTCGGAC GCCCAGGTTA ATACACAGCG TGC 393

(2) INFORMATION FOR SEQ ID NO: 76 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 
(ix) FEATURE: 

(D) OTHER INFORMATION: #2-147

(xi) SEQUENCE DESCRIPTION: SEQ ID NO 76

GCAGACCTCT GGCCGCTGGT GGTGCTGGGT ACCTGCGCTG GCGACACCGG 50 
ACCGCAGACC GTCAATCGGG ACTCCCGGGA ACGTGGTGCC ATCTTGCCAC 100 
GGGGATGGCC GACGCGGCTC GTCATTCTCC CCGAGCGCAC CGGCCGCCGC 150 
TGTTGACCGG GCCGCGGCGA CTGATGGTGC CCGCACACGC GGGCGGGTTC 200 

AAGGAGCAAT ACGCCAAGTC CAGCGCCGCT CTCGCACGGC GCGGTGTT 248

15. A nucleic acid probe comprising a nucleic acid molecule according to claim 12 and a diagnostic label.

16. A method of isolating a Mycobacterium tuberculosis gene which gene encodes an immunostimulatory
peptide, the method comprising the steps of:

providing nucleic acids of Mycobacterium tuberculosis;

contacting said nucleic acids with a probe or primer, the probe or primer comprising at least 15 
contiguous nucleotides of a polynucleotide having a nucleotide sequence selected from the group consisting of Seq. ID 
Nos. 1 - 76 and sequences complementary thereto; and 

isolating the Mycobacterium tuberculosis gene. 


17. An isolated Mycobacterium tuberculosis gene produced by the method of claim 16. 



18. An isolated Mycobacterium tuberculosis nucleic acid molecule, said molecule encoding an
immunostimulatory peptide and hybridizing under conditions of at least 75% stringency to a nucleic acid probe
comprising at least 20 contiguous bases of a sequence selected from Seq. ID Nos. 1 - 76.

19. A purified immunostimulatory peptide encoded by the nucleic acid molecule of claim 18.



20. An immunostimulatory preparation comprising:
a purified peptide according to claim 19; and
a pharmaceutically acceptable excipient.



21. An improved tuberculin skin test, the improvement comprising the use of one or more immunostimulatory
peptides according to claim 19.

22. A vaccine preparation comprising an immunostimulatory membrane peptide isolated from Mycobacterium
tuberculosis and a suitable excipient. 



23. A method of detecting the presence of Mycobacterium tuberculosis DNA in a sample comprising contacting
the sample with a nucleic acid probe according to claim 15 and detecting hybridization products that include the nucleic
acid probe.



24. A method of detecting the presence of Mycobacterium tuberculosis DNA in a sample comprising:
selecting two or more nucleic acid primer molecules from the nucleic acid molecules defined in claim 12, said
molecules suitable for amplification of a Mycobacterium tuberculosis target sequence;

incubating the sample under conditions suitable to amplify the target sequence; and 
detecting an amplified product. 






25. A method of detecting the presence of a Mycobacterium tuberculosis peptide in a sample comprising
contacting the sample with an antibody according to claim 3 and detecting the presence of an antibody-peptide complex.

26. A method of detecting the presence of an anti-Mycobacterium tuberculosis antibody in a sample comprising
contacting the sample with a peptide according to claim 2 and detecting the presence of an antibody-peptide complex.
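
Claims 12 and 16 refer to sequences complementary to a listed sequence and to probes or primers of at least 15 contiguous nucleotides. The two underlying sequence operations can be illustrated with a short Python sketch. This is an illustration only and not part of the claimed subject matter: the function names are hypothetical, "complementary" is taken here to mean the reverse complement read 5' to 3', and the example input is the first row of SEQ ID NO 74 as printed in the listing. Hybridization stringency is an experimental condition and is not modeled.

```python
# Complement table covering the IUPAC codes that occur in the listing
# (A, C, G, T plus ambiguity codes such as N, R, Y, K, S, W).
_COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def contiguous_windows(seq, size=15):
    """Return every run of `size` contiguous nucleotides in `seq`."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]

# First 50 bases of SEQ ID NO 74, with the grouping spaces removed.
seq74_first_row = "GGCGCCGCCGTCGTGCTGGCCGCCCGGCCCGGTGGGGGTGCCGGCCAGCG"

# Candidate 15-nucleotide probes from the listed strand and from its complement.
forward_probes = contiguous_windows(seq74_first_row, 15)
reverse_probes = contiguous_windows(reverse_complement(seq74_first_row), 15)
print(len(forward_probes), forward_probes[0], reverse_probes[0])
```

A window size of 20 would correspond to the "at least 20 contiguous bases" wording of claim 18.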